Needed Improvements
In Test Norms


An editorial by Alfred S. Lewerenz 
Director, Evaluation and Research 
Los Angeles City School Districts 


The last thirty years have seen a tremendous growth in the construction 
and distribution of educational tests and measures of mental ability. The 
areas of measurement now include all important areas of learning and also 
the aptitudes related to the many occupations available in the world of work. 
Intelligence tests have sought to measure mental growth from a wide variety 
of standpoints. Likewise, interest inventories and measures of temperament 
have sought for better pupil adjustment. 

This wide variety of measures has been produced under competitive 
conditions in the true American spirit. Tests have survived on the basis of 
merit and the less effective have dropped by the wayside. Because these 
measures were a product of individual enterprise and initiative, they show 
a wide range of approaches to the problem. Moreover, some incorporate 
certain features which are perhaps not too desirable. The time has perhaps 
come when certain methods and techniques of test construction and inter- 
pretation should be reviewed. Undesirable practices should be eliminated 
and agreed upon procedures should become more or less standard practice. 

The automotive industry long ago faced a similar problem. Even though 
it was a highly competitive business, there were certain things that it was 
mutually possible for all to agree upon. One of the functions of the Society 
of Automotive Engineers has been to bring about certain standardized 
procedures in the nature of common elements, such as bolt sizes and threads, 
oil grades, bumper heights and other interchange of common elements. The 
test publishing business is in need of some such working group to bring 
about standardization which will be of mutual benefit to both publisher 
and consumer. There are at least ten areas where publishers could materially
improve their tests: 

I. Certain group intelligence tests seem consistently to yield higher or 
lower average I.Q.’s than found from group intelligence tests given at other 
grades. In other words, the consistency of intelligence measurement is not 
sufficiently in agreement from one test to another. Some common yardstick 
is needed so that a uniform set of standardization values may be derived 
for mental age. Where two intelligence tests are standardized on two 
different types of population, it is altogether possible to get two different
means and ranges for pupils of the same average age. Therefore, if the 
individual pupil is given one test, his I.Q. may appear to be 100. Given 
the other test, the I.Q. may appear to be 106. Such lack of agreement due 
to variation in the standardization population is undesirable and some sort 
of “Bureau of Standards” values should be established as criteria for 
validation. 
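
The 100-versus-106 illustration can be made concrete with the conventional ratio definition, I.Q. = 100 x mental age / chronological age. The sketch below is only an illustration: the pupil's age and the two mental ages are hypothetical figures chosen to reproduce the discrepancy described above, and they are not taken from any published test.

```python
def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Conventional ratio I.Q.: 100 * mental age / chronological age."""
    return 100.0 * mental_age_months / chronological_age_months


# Hypothetical pupil aged 10 years 0 months (120 months).
chronological_age = 120

# Hypothetical mental ages that two differently standardized tests
# might assign to the same pupil.
mental_age_test_1 = 120  # assumed value for the first test
mental_age_test_2 = 127  # assumed value for the second test

iq_1 = ratio_iq(mental_age_test_1, chronological_age)
iq_2 = ratio_iq(mental_age_test_2, chronological_age)

print(round(iq_1), round(iq_2))
```

A difference of only seven months in assigned mental age is enough, at this age, to move the quotient by about six points.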

II. Tests supplied with interpretation by age for achievement very often 
will give grade placement equivalents. When one examines, let us say, an 
age of 152 months, the grade placement may be 7.0, whereas on another 
test the grade placement value for the same age may be as high as 7.7. 
Such variability in such an essential interpretive aid is unthinkable. It is 
like having on one ruler, twelve inches to a foot and on another ruler, 
fourteen inches to a foot. Any test worthy of being called a nationally 
standardized measure, should be able to show agreement between educa- 
tional ages and grade placement equivalents in line with current age-grade 
relationships as found commonly throughout the United States. 

III. For the most part, in the case of achievement tests, only one type 
of norm interpretation is provided by a publisher. These norms are usually 
based on an assumed average population of perhaps 100 average I.Q. Such 
norms are really only applicable in a school or in a community where the 
average I.Q. is also 100. For a school with an average I.Q. of 93 or 111, 
such norms are not an adequate measure. Test norms should have sufficient 
variability so that a given school or school system can find out how well it 
compares with other schools with the same average chronological age and 
intelligence. Such a step would prevent self-satisfaction on the part of a 
school with a superior population and prevent unnecessary discontent on 
the part of a school with a low population. 

IV. Renewed interest on the part of the school people in the education 
of the gifted calls for more refined norms for top level attainment and 
achievement. Good guidance calls for the conservation of ability of pupils 
of great attainment. In many cases present test norms, while revealing that 
a pupil is outstanding, do not especially point out that he may be one in a 
million rather than one perhaps in a hundred or a thousand. Some sort of 
more finely calibrated norms at the upper levels of tests would serve to 
impress counselors with the importance of adequate guidance for individuals 
of rare attainment.

V. Since a number of schools are developing instructional programs 
which minimize grade organization but work on a basis of maturity levels 
related to mental age and educational achievement, tests should be so 
normed as to yield an interpretation in terms other than the customary grade 
placement. One approach would be to provide interpretive data on the 
basis of achievement averages by increments of age and intelligence. 

VI. Educators anxious to improve their instructional program on the 
basis of standardized tests are often baffled by the lack of information about
standardization groups used in constructing norms. Test publishers should
provide with any set of norms, facts for the standardization group with
regard to ageness, intelligence, type of promotion plan, rural or urban
population, compulsory education laws, economic level and social background.
It is natural for a big city to wish to compare its results with other big city
populations having about the same background of experience and educational
facilities. Likewise, a rural district would wish to compare its results
with those of comparable districts gathered on a national basis.

VII. Users of tests are often dismayed to discover that a test that they 
have used regularly suddenly has been provided with a new set of norms. 
These norms do not always seem to be in line with actual conditions. The 
revision of test norms should be subject to certain professional standards 
with regard to the basis for making the revision as well as the introduction 
of the norms in a manner that will not seriously disturb an on-going testing 
program. 

VIII. When norms have been revised on an adequate basis, it is desirable 
that interpretation data be provided which will make possible a comparison 
of results over a period of years. In other words, it is frequently desirable 
to know what the achievement test results for a certain group would have
been had not the norms been revised. This is particularly true when an
experimental tryout is being made over a period of years, with “before” and
“after” testing. When a change in tests is made during such a tryout it is
frequently difficult to get an adequate measure of the effect of the program.

IX. Test norms used in the evaluation of school achievement should be 
sufficiently varied to have meaning to the pupil, parent, teacher, supervisor, 
principal, superintendent and public. These results have different meanings 
and applications for those who make use of them. Great ingenuity is needed 
to make test results of value from the standpoint of curriculum, instruction, 
guidance, administration and public support. Too often test norms seem to 
be designed only for the use of classroom teachers. 

X. Test norms should have provision for a form of interpretation that 
can be understood by the lay public. Such a development would be based
partially upon the suggestions contained in items III and V. It means little
to the general public to hear that the A-12 graduating group stood at the
68th percentile for the nation. Some citizens will undoubtedly wonder why
their school system shouldn’t stand at the 100th percentile, 100 in many
people’s minds standing as a symbol of perfection. Perhaps some sort of
descriptive phrases could be suggested for certain types of performance,
such as: “X” school system did a quality of work ten per cent higher than
might be expected, or “Y” school indicated an efficiency of 103 per cent
and “Z” school is at a level above what might be expected in every subject
measured with the exception of arithmetic, where plans for improvement
are under way.

Two professional organizations have taken the lead in establishing
specifications for test construction and standardization. These recommenda- 
tions include many highly important technical and statistical procedures. 
The American Psychological Association has made a number of such
recommendations for building intelligence scales. Likewise, committees of the
American Educational Research Association and the National Council on 
Measurements Used in Education have produced an excellent manual for 
test publishers with regard to the preparation of achievement measures. 

Both of these proposals made by our professional organizations point 
the way to better tests. 

It may be hoped that within the next few years, either through the
professional organizations concerned with testing, or through cooperation
of the test publishers themselves, additional steps can be taken
for improving the methods of test standardization and interpretation. Such
steps need in no way diminish the competitive aspects of the test business 
and they would immeasurably increase the satisfaction of customer use of 
such measures. 


Supplement to List of 
Dissertations in Education— 1954-1955 


The following doctoral dissertations were inadvertently omitted from
the list of dissertations accepted by California colleges and universities 
during the 1954-1955 academic year in the field of education. This list was 
published in the November, 1955, issue of the California Journal of Educa- 
tional Research. 


Forbes, Robert J. Aspects of School-Community Tension. Claremont Graduate
School.
Patterson, Franklin K. Organized Criticism of Public Schools Viewed by
Educational [illegible] A Study of Opinion. Claremont Graduate School.
[Author illegible]. A Study of the Relationship of Social Status in a High School
Senior Class to the Selection of Teaching as a Career. Claremont Graduate
School.








Problems Associated with Intelligence Testing 
In a Large City District 


HOWARD A. BOWMAN


Possibly one of the more baffling problems associated with a testing 
program in a large city school district is one which obtains in Los Angeles, 
and which may be typical of large cities, although no data are available to 
indicate that this is the case. This is the problem which arises when the 
district, considered as a total, stops looking like a single school district and 
takes on the aspect of several districts, each with its own characteristics, 
and each large enough to constitute a good sized school district in itself. 


The Testing Program 


Perhaps a brief overview of the testing program which has operated in 
Los Angeles for about ten years will be in order at this point, so that later 
observations will be clarified. The minimal testing program in Los Angeles 
calls for intelligence testing of each pupil at the opening of each odd- 
numbered year, i.e., 1, 3, 5, and so forth. The same program calls for testing 
of reading vocabulary and comprehension during each semester of the third 
grade, and for the administration of a complete achievement battery to each 
pupil once every two years thereafter. At the elementary level this is 
accomplished by testing approximately half of the pupils every fall, and 
at the secondary level by testing all pupils every other spring, the junior and 
senior high schools testing during alternate years. 

The interpretation of achievement testing is done in two ways. First 
is the usual interpretation by means of comparing school and system means 
with the grade norm. Second, however, is a method of comparison which 
involves computation of an “expected achievement grade placement” by 
means of formulae developed some years ago, and involving the use of 
mental age and chronological age in specific proportions, the proportions 
depending upon the chronological age level.1


1 Horn, Alice McAnulty, Uneven Distribution of the Effects of Specific Factors, 
Southern California Education Monographs, Number 12, Los Angeles: University 
of Southern California Press, 1941. 


Howard A. Bowman is Supervisor of Measurement and Evaluation for the 
Los Angeles City School Districts, a position he has held for the past nine years. 

The use of an expectancy formula accomplishes the result of placing
each pupil and each school upon an equal footing, thus making possible a
somewhat better estimate of status or of growth than may ordinarily be
derived from inspection of achievement test results in terms of grade norm.
It will be observed that, chronological age being a natural function of the 
individuals involved, and incapable of effective change except through 
computational error, the accuracy of the ultimate interpretation depends 
upon the accuracy of measurement on the part of the achievement test, and 
the accuracy of measurement on the part of the intelligence test. 
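
The dependence of the expectancy figure on the intelligence measurement can be sketched as follows. This is a sketch under stated assumptions, not the formula actually used: the equal weights and the school-entry offset below are placeholders invented for illustration, since the proportions given in the Horn monograph are not reproduced in this article.

```python
def expected_grade_placement(mental_age_years: float,
                             chronological_age_years: float,
                             mental_weight: float = 0.5,
                             school_entry_age: float = 5.0) -> float:
    """Hypothetical expectancy: a weighted blend of mental and chronological
    age, converted to a grade placement by subtracting an assumed offset.
    Both default parameter values are illustrative assumptions."""
    expected_age = (mental_weight * mental_age_years
                    + (1.0 - mental_weight) * chronological_age_years)
    return expected_age - school_entry_age


# The same hypothetical pupil (chronological age 10.0 years) with two
# different measured mental ages, as two intelligence tests might report.
low = expected_grade_placement(10.0, 10.0)
high = expected_grade_placement(10.6, 10.0)
shift = high - low

print(round(low, 2), round(high, 2), round(shift, 2))
```

Whatever the true weights, an error in measured mental age passes into the expected grade placement in proportion to the weight given to mental age, while chronological age contributes no such error.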

In those instances in which all, or nearly all, schools employ the same 
instruments for measuring intelligence and achievement, the problem is not 
as severe in matters of interpretation. Whatever measurement error exists, 
it is at least constant for all measurements, and allowances may be made. 
This is the case with the tests used at both junior and senior high school
levels. Virtually all pupils are tested by means of the same instruments at
these levels.


The Elementary School Situation 


At the elementary school level, however, a different situation exists. It 
is here, too, that the characteristic of variability between parts of the district 
which was mentioned in the opening paragraph begins to play a large 
part. The elementary school district, with its approximately 375 schools, 
is too large for efficient single-unit administration. It has been divided, 
therefore, into five districts, each under the administration of an Assistant 
Superintendent, and each with its own supervisory staff. One of these
districts alone, the burgeoning Valley District, which encompasses the San 
Fernando Valley, may be ranked, numerically, within the first ten elemen- 
tary school systems in the United States. 

Among these five districts a large degree of autonomy exists. Moreover, 
each is treated as a separate school district so far as the testing program is 
concerned, particularly with reference to interpretation. Although all schools 
in the city employ the same achievement measures at the same grade levels, 
and test at approximately the same time (within a four-week span), the 
determination of which intelligence test shall be used is largely the province 
of the individual school principal, and depends to a large degree upon 
which test or tests teachers in the school may have been trained to admin- 
ister. It is at precisely this point that problems begin to arise, for here the 
interpretation in terms of expectancy, generally regarded as being most 
useful to the teacher and to the supervisory staff of the district, may begin 
to develop inaccuracies. These inaccuracies, if they exist, are traceable to 
whatever variability may exist between the measuring qualities of the several 
intelligence tests which may be employed at any given grade level, as well 
as between the measuring qualities of whatever intelligence tests may be 
employed at successive testing points, since all results are recorded for
pupils upon cumulative record cards. 


The problem is one which is not visible in over-all results, except through 
suspicion that the average I.Q.’s do not appear to coincide with other known 
qualities of schools or districts. In fact, as will be seen later, there is, as 
yet, no direct evidence that a problem actually exists, although suspicion 
has been increased through a study of certain data. 


Over a period of several years, evidence began to accumulate that certain 
districts tended to show higher average I.Q.’s than did others. Moreover, 
in these districts, the average I.Q.’s were substantially above the 100 mark, 
and the differences were somewhat greater at the first grade level than at 
higher grades. It was judged through supplementary evidence that some 
differences should exist, but that the magnitude of these differences, espe- 
cially at the first grade level, appeared to be greater than should have been 
the case. It was decided that a study of the intelligence tests used would 
be appropriate, and such a study was initiated in the fall of 1954. 


Special Study in Grade One 


Special data sheets were printed for the use of teachers of beginning 
first grade pupils. These data sheets provided spaces for listing pupil 
names, birth dates, chronological ages, mental ages, intelligence quotients, 
the names of the tests used, and other supplementary data relative to the 
number of semesters spent by each pupil in kindergarten and whether the 
pupil was repeating the first grade or the first semester thereof. The latter 
named data are not important to this report. 


Because of the nature and magnitude of the testing program already 
arranged for the schools, a decision had to be reached as to how far this 
special study of first grade data would go. In order to limit the study to 
that which was possible within the confines of the available teacher time 
and the budget for purchasing tests, it was decided that adequate data 
might be secured by simply recording the results of the testing already 
scheduled for the first grade. All schools were already expecting to
administer either Test “A” or Test “B,”2 and the study would impose no additional
burden other than that of recording the data on the special sheets.


Data recorded on the data sheets were punched into IBM cards so that
they might be sorted and tabulated in a number of ways. After the
tabulations had been delivered and subjected to a number of calculations, it was
deemed advisable to continue the study into the second semester, securing
similar data for pupils tested as entering first graders in February, 1955.
This continuation of the study was based on the premise that comparative
study of fall and spring groups at higher grade levels had shown substantial
differences in I.Q. and achievement to exist, these differences virtually
always in favor of members of the fall group. It appeared desirable to
secure data as to the fundamental nature of such differences as opposed to
their creation through retention, or for other administrative reasons.
Moreover, the first grade data obtained in the fall seemed to present such an
unusual aspect that it was felt wise to attempt to verify or reject them.


2 Because the study is incomplete, the two tests involved will be designated
Test “A” and Test “B.”

Table I presents the means of I.Q. and the standard deviations derived 
from the administration of the two tests used at the first grade level for both 
the fall semester, 1954, and the spring semester, 1955. 


TABLE I
Standard Deviations and Mean Intelligence Quotients Derived From the
Administration of Intelligence Tests to Beginning First Grade Pupils,
Fall 1954 and Spring 1955

                          Fall, 1954    Spring, 1955
Test A
  Mean IQ                     106.05          108.03
  Standard deviation           17.04           17.11
  Number (N)                  16,049          11,432
Test B
  Mean IQ                     100.17          101.62
  Standard deviation           10.94           11.61
  Number (N)                   6,052           3,831








It will be noted that the measurements obtained by means of either 
Test “A” or Test “B” are reasonably consistent when fall results are
compared with spring results. In neither instance is there a large change in
I.Q. or standard deviation, the I.Q. change amounting to less than two 
points in the largest change, and the standard deviation change not exceed- 
ing 0.7 in either instance. Most significant, however, is the obvious 
difference which exists between measurements made by means of Test “A,” 
and those made by means of Test “B.” Here, whether fall or spring group 
is considered, there are large differences in mean I.Q. and large differences 
in the standard deviation. While the magnitudes of the differences in 
standard deviation are mainly interesting from a technical standpoint, they 
also have instructional implications relating to the placement of pupils 
whose I.Q.’s are substantially above or below average and the provision of 
instructional programs for them, particularly in those instances in which 
those measured by means of Test “A” are compared with those measured 
by means of Test “B.” 
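
The practical weight of the difference in standard deviations can be illustrated with a rough calculation from the fall figures in Table I. The sketch below assumes, purely for illustration, that I.Q.'s from each test are normally distributed; the cut-off points of 120 and 80 are likewise arbitrary choices and do not come from the study.

```python
from statistics import NormalDist

# Fall 1954 means and standard deviations from Table I.
test_a = NormalDist(mu=106.05, sigma=17.04)
test_b = NormalDist(mu=100.17, sigma=10.94)


def share_above(dist: NormalDist, cutoff: float) -> float:
    """Per cent of pupils expected above the cutoff under the normal assumption."""
    return 100.0 * (1.0 - dist.cdf(cutoff))


def share_below(dist: NormalDist, cutoff: float) -> float:
    """Per cent of pupils expected below the cutoff under the normal assumption."""
    return 100.0 * dist.cdf(cutoff)


# Illustrative cut-offs only; they are not taken from the study.
for name, dist in (("Test A", test_a), ("Test B", test_b)):
    print(name,
          round(share_above(dist, 120), 1),
          round(share_below(dist, 80), 1))
```

Under these assumptions Test “A” would identify several times as many pupils above 120 as Test “B,” which is the kind of placement consequence referred to above.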


Effect of Variations Studied


It was pointed out earlier that one basis of interpretation was that 
afforded through the use of an expected achievement grade placement 
which was calculated from the mental age and chronological age by means 
of a formula, the specific proportions of which were based upon the chrono- 
logical age level. With the high variation in I.Q. shown in Table I, the 
question now arises whether the calculation of the expected achievement 
grade placement might not be unduly affected by such variations. It was 
determined that distributions of mental and chronological ages might cast 
some light upon this question, and such distributions were prepared. 
Because of the widely differing numbers of pupils tested by means of the 
two tests in question, the distributions were cast in percentages in order 
to make use of comparable figures within specific age intervals. These 
distributions are shown in Table II and Table III. 


TABLE II
Percentage Distributions of Chronological Ages of Beginning First Grade
Pupils Tested by Means of Two Different Intelligence Tests,
Fall 1954 and Spring 1955

                            Per Cent Falling in Each Age Interval
Chronological                  Fall, 1954            Spring, 1955
Age Intervals               Test A    Test B      Test A    Test B
9-6 to 9-11                    .01       ...         ...       ...
9-0 to 9-5                     ...       ...         ...       ...
8-6 to 8-11                    .01       ...         .01       ...
8-0 to 8-5                     .04       ...         ...       ...
7-6 to 7-11                    .11       .15         .01       .03
7-0 to 7-5                     .46       .82         .25       .26
6-6 to 6-11                   5.46      9.50         .97      1.64
6-0 to 6-5                   68.61     72.35       47.49     55.91
5-6 to 5-11                  25.27     17.11       50.98     42.09
5-0 to 5-5                     .04       .07         .28       .08
Total                       100.01    100.00       99.99    100.01
Mean Chronological Age         6-2       6-2         6-0       6-0
Number (N)                  16,126     6,084      11,433     3,835








Table II and Table III, particularly the latter, illustrate clearly the
extent to which the differences earlier observed in I.Q. and standard
deviation are traceable to the two measures used. Chronological ages
for the groups are identical as to mean in the fall and again in the
spring. The two month difference between mean chronological ages of
fall pupils as against spring pupils is probably due to the greater span of
months from which entering first graders may legally be drawn in
September as compared with February. However, the rather astonishing differences
in mental ages, with reference both to mean and to distribution, which
appear in Table III give considerable food for thought. Test “A” for the
fall pupils shows nearly twenty per cent with mental ages above seven years
and five months, and the same test with spring pupils shows nearly seventeen
per cent in the same category. Test “B” shows one per cent or less in either
case in this mental age group. At the other end of the range, however, the
differences are practically non-existent. Using five years and six months
as a cut-off point, Test “A” shows nearly eighteen per cent in the fall and
about the same proportion in the spring. Test “B” shows about fourteen
and sixteen per cent in the fall and spring respectively.


TABLE III
Percentage Distributions of Mental Ages of Beginning First Grade Pupils
Tested by Means of Two Different Intelligence Tests,
Fall 1954 and Spring 1955

                            Per Cent Falling in Each Age Interval
Mental                            Fall, 1954                  Spring, 1955
Age Intervals               Test A        Test B        Test A        Test B
9-6 to 9-11                    .01           ...           ...           ...
9-0 to 9-5                     .20           ...           .10           ...
8-6 to 8-11                   1.73   [illegible]          1.14   [illegible]
8-0 to 8-5                    5.43           .05          4.57           .03
7-6 to 7-11                  12.37           .25         11.10           .08
7-0 to 7-5                   15.65          8.15         16.26          5.15
6-6 to 6-11                  17.97         32.60         18.33         28.87
6-0 to 6-5                   15.47         28.53         16.21         31.55
5-6 to 5-11                  13.40         16.68         13.85         17.09
5-0 to 5-5                   10.90          8.71         11.34         10.18
4-6 to 4-11                   4.14          3.80          4.44          5.13
4-0 to 4-5                    1.70          1.22          1.61          1.79
3-6 to 3-11                   1.04   [illegible]          1.04           .11
3-0 to 3-5                     ...           ...           .01           ...
Below 3-0                      ...           .02           ...           .03
Total                       100.01         99.99        100.00        100.01
Mean Mental Age                6-6           6-2           6-5           6-1
Number (N)                  16,126         6,084        11,433         3,835
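
The tail proportions quoted for Test “A” can be checked directly from the Table III percentages. The following is a minimal sketch; the dictionary layout (each half-year interval keyed by its lower bound in months) and the function names are illustrative choices, and only the Test “A” columns are used.

```python
# Per cent of pupils in each mental-age interval for Test A (Table III).
# Keys are the lower bound of each half-year interval, in months
# (e.g. 90 stands for the interval 7-6 to 7-11).
test_a_fall = {
    114: 0.01, 108: 0.20, 102: 1.73, 96: 5.43, 90: 12.37, 84: 15.65,
    78: 17.97, 72: 15.47, 66: 13.40, 60: 10.90, 54: 4.14, 48: 1.70,
    42: 1.04,
}
test_a_spring = {
    108: 0.10, 102: 1.14, 96: 4.57, 90: 11.10, 84: 16.26,
    78: 18.33, 72: 16.21, 66: 13.85, 60: 11.34, 54: 4.44, 48: 1.61,
    42: 1.04, 36: 0.01,
}


def share_above_7_5(distribution: dict) -> float:
    """Sum the intervals that begin at 7 years 6 months (90 months) or later."""
    return sum(pct for lower, pct in distribution.items() if lower >= 90)


def share_below_5_6(distribution: dict) -> float:
    """Sum the intervals that begin before 5 years 6 months (66 months)."""
    return sum(pct for lower, pct in distribution.items() if lower < 66)


for label, dist in (("fall", test_a_fall), ("spring", test_a_spring)):
    print(label,
          round(share_above_7_5(dist), 2),
          round(share_below_5_6(dist), 2))
```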

Obviously, were the mean mental ages derived from one test to be
“normalized” in terms of the other, these proportions would vary consider- 
ably, dropping at the upper end of the scale, and rising at the lower end. 
However, we are here concerned with what the teacher gets when she 
uses either test without reference to the other, and what the teacher three 
or four years hence will see on the cumulative record card when she exam- 
ines the recorded I.Q.’s for a pupil. 


Analysis by Districts 


At this point in the study it was deemed desirable to break down the 
data by districts. It has been reported previously that there seem to be 
fundamental differences between certain of the districts with reference to 
average ability and average achievement. The question now arose as to 
whether these differences in ability might not be due to the test being used, 
rather than to any basic intellectual qualities of the pupils concerned. 
Tables IV and V report certain of the data by districts. 











TABLE IV
Beginning First Grade Intelligence Tests Data by Districts, Fall 1954

                    Test A                  Test B
District       Number   Mean IQ        Number   Mean IQ
    1           3,304        99           211        99
    2           2,829       100           914       100
    3              35       104         4,322       100
    4           6,166       110           177       101
    5           3,625       110           428       105

Perhaps a district-by-district review of the data shown in Tables IV and 
V will serve to illustrate the discrepant nature of the findings. In District 
1 the I.Q.’s derived from either test on either occasion are identical, so far 
as district mean is concerned, although the numbers tested with Test “B” 
are considerably smaller than those tested by means of Test “A.” In the 
fall of 1954 only about six per cent of the District 1 pupils were tested 
with Test “B,” and in the spring this same proportion amounted to about 
seven per cent. 

In the case of District 2, about 24 per cent of the pupils took Test “B”
in the fall, and about 26 per cent in the spring. The mean I.Q.’s for the 
two tests were identical in the fall, but Test “A” yielded a mean I.Q. four 
points higher in the spring. 

In District 3, measurement was preponderantly by means of Test “B”
on both occasions. In the fall about 99 per cent of the pupils took Test
“B,” and in the spring the proportion was about 97 per cent. Here a
difference of four I.Q. points shows up for the fall group, and for the spring
group it has risen to ten points.


TABLE V
Beginning First Grade Intelligence Tests Data by Districts, Spring 1955

                    Test A                  Test B
District       Number   Mean IQ        Number   Mean IQ
    1           2,249       100           173       100
    2           1,974       104           684       100
    3              16       112         2,732       102
    4           4,470       111            64       104
    5           2,663       112           178       106

District 4 had about three per cent of its pupils tested by means of 
Test “B” in the fall, and about one per cent so tested in the spring. Here 
the differences in yielded I.Q.’s are dramatic, favoring Test “A” by nine 
points in the fall and seven in the spring. 

District 5 results are quite similar to those of District 4. The I.Q. 
differences in favor of Test “A” are five points in the fall and six points 
in the spring. Most pupils were tested by means of Test “A,” the propor- 
tions tested by means of Test “B” being about eleven per cent in the fall 
and about six per cent in the spring. 
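
The fall proportions and I.Q. differences cited in this district-by-district review follow directly from the Table IV figures, as the minimal sketch below shows. The data layout and the function name are illustrative only.

```python
# Table IV, fall 1954:
# district -> (Test A number, Test A mean I.Q., Test B number, Test B mean I.Q.)
fall_1954 = {
    1: (3304, 99, 211, 99),
    2: (2829, 100, 914, 100),
    3: (35, 104, 4322, 100),
    4: (6166, 110, 177, 101),
    5: (3625, 110, 428, 105),
}


def summarize(row):
    """Return (per cent of pupils given Test B, Test A mean minus Test B mean)."""
    n_a, iq_a, n_b, iq_b = row
    pct_b = 100.0 * n_b / (n_a + n_b)
    return round(pct_b), iq_a - iq_b


summary = {district: summarize(row) for district, row in fall_1954.items()}

for district, (pct_b, difference) in summary.items():
    print(f"District {district}: about {pct_b} per cent took Test B; "
          f"Test A mean higher by {difference} points")
```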

All of the above findings serve more to confuse than to illuminate. 
Districts 4 and 5 are those which usually show to better advantage on 
achievement tests, and average I.Q.’s for other grade levels are usually 
somewhat higher in these districts than in the other three. This would 
appear to be borne out by the findings of this study. But how to account 
for the District 3 findings? True, the numbers tested by means of Test “A” 
are small, probably representing but one or two classes, and these conceiv- 
ably might have been in schools which tended to have brighter pupils. Yet, 
in the case of District 2 for the spring semester the numbers tested by 
means of both tests are quite respectable and there is a difference of four 
I.Q. points between the two tests. 


Conclusions Not Final 


Needless to say, no lasting conclusions have been drawn from this study. 
It is deemed probable that Test “A” yields I.Q.’s which are somewhat higher 
than they should be, and it is possible that Test “B” yields I.Q.’s somewhat 
lower than they should be (see Table I). All things considered, it seems 
likely that Test “B” is closer to being right than Test “A.” This conclusion 
is based on the purely empirical reasoning that it appears unlikely that 
numbers of pupils such as were tested by means of Test “A,” living in a 
large metropolitan center such as Los Angeles, would average nearly
one-half a standard deviation above the “normal” I.Q. of 100. Moreover, testing
at higher grade levels tends to show a mean city-wide I.Q. of about 100. 
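
The distance of the Test “A” means from 100 can be expressed in standard deviation units from the Table I figures. The sketch below divides by each group's own observed standard deviation; that choice is an assumption, since the text does not say which standard deviation was intended.

```python
# Test A means and standard deviations from Table I.
test_a = {
    "fall 1954": (106.05, 17.04),
    "spring 1955": (108.03, 17.11),
}
NORMAL_IQ = 100  # the "normal" I.Q. referred to in the text

# Distance of each mean above 100, in units of that group's own
# observed standard deviation (an assumed choice of unit).
distance_in_sd = {
    term: (mean - NORMAL_IQ) / sd for term, (mean, sd) in test_a.items()
}

for term, z in distance_in_sd.items():
    print(term, round(z, 2))
```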

It is planned to continue the study this year. The next step is to 
administer both tests to the same pupils. This rather obvious procedure 
was avoided in the first instance because of its unwieldiness, because of 
the need for training additional teachers to administer the test with which 
they are unfamiliar, and out of a desire to keep the teacher burden as light 
as possible. The preliminary findings, however, appear to warrant further 
exploration of the situation, with direct comparisons appearing to offer the 
better chance of usable results. 


CERA ANNUAL MEETING 


The California Educational Research Association will hold its Thirty- 
fourth Annual Meeting at the Hacienda Hotel, Fresno, March 23 and 24, 
1956. Research workers interested in the problems of education have been 
invited to attend—college and university faculty members, professional 
research workers in education, school teachers, counselors, administrators,
and others. The meeting is being held in Fresno this year in the hope that
more people from the southern part of the State will be able to participate. 
A special invitation has also been extended to research workers of our 
neighboring states. For further information one should contact the Secre- 
tary-Treasurer, Miss Hazel M. Lewis, 324 N. San Joaquin St., Stockton 2, 
California. 





The California State Department of Education has recently issued an attractive 
booklet titled “Teaching Load in California Public Schools.” This publication 
consists of three sections which originally appeared as articles in California 
Schools. The studies are based on data secured from 41,781 full-time teachers in 
a survey made in April, 1950. 



Relationships Between Norms for 


Mental Maturity and Achievement Tests 


WILLIAM M. SHANNER 


The interpretation of achievement test performance in relationship to 
performance on mental maturity tests is a complex problem. Some of the 
complexity is due to the fact that the assumptions and techniques followed 
in norming maturity tests are not applicable for norming achievement tests. 
Unless these basic differences in norming procedures are considered in 
interpretation of test results, incorrect generalizations may be made. The 
old situation of “bright children underachieving and dull children
overachieving” is a case in point.

Let us look at the norming procedures. To simplify the discussion we 
will consider only the most widely used norms, namely grade placement 
equivalents for achievement tests and age norms for mental maturity tests. 
Establishing mental age equivalents for a mental maturity test involves 
grouping all persons in the norming group by chronological age and com- 
puting the test score representative for each age group. Thus a given test 
score indicates a level of performance typical of a certain chronological 
age. For achievement tests the grouping is by school grades. Here a 
given test score indicates a level of performance typical of a certain school 
grade. By statistical procedures one then develops mental age or grade 
placement equivalents for the entire ranges of the test scores. 

The important concept, however, is that at one time we are grouping 
persons by chronological age and at the other time we are grouping persons 
by school grades. This situation is similar to what occurs in interpreting 
test results in an actual situation. A pupil is in a certain grade but we 
wish to interpret his achievement in terms of his mental maturity (or mental 
age) characteristics. 


William M. Shanner is Director of Professional Service of the California Test 
Bureau, a position he has held for the last two years. His previous experience
includes service as Examiner, Board of Examinations, University of Chicago 
(1938-39), Assistant Professor of Education, University of Chicago (1945-51), 
Assistant Chairman, Department of Education, University of Chicago (1947-51), 
and Executive Director, University of Oklahoma Research Institute (1951-53). He 
received his doctorate from the University of Chicago in 1944. This article is a 
discussion of data presented in the manuals for the California Achievement Test 
and the California Test of Mental Maturity.

What implications does this have? We can analyze this problem to
some degree by considering our own experiences. Consider all ten-year-olds. 
We know that they will have a wide range of mental ability and a wide 
range of achievement. Now consider all fourth-grade youngsters. These
youngsters will in most cases be more alike as a group with respect to
mental ability and achievement than are the ten-year-olds. Here we can
change our definition of a fourth grader by administrative procedure and 
make our group as homogeneous as we wish. Retardation and acceleration 
will have taken place to some extent in all school situations so that school 
groupings are more homogeneous than age groupings. 

What I have tried to point out is that the ranges of mental ability and 
achievement are less when we group persons by school grade than when 
we group persons by chronological age. Thus we find that the units estab- 
lished for norms where grouping is by age are not directly comparable on 
a one-to-one basis with units established for norms where grouping is by 
grade. This point can be established empirically from data published in 
manuals of achievement and mental maturity tests. 

Table I summarizes data published in test manuals in this fashion and 
reports the distributions of achievement test and mental maturity test norms 
expressed in comparable units. The data in the table were derived from 
the data published in the manuals for the California Achievement Test (1) 
and the California Test of Mental Maturity (2). The following procedure 
was followed: 

1. Data from the percentile norms tables in the Manuals for the California
   Achievement Test were plotted on probability paper. Grade placement
   units were plotted against percentiles. The elementary level of the test
   was used for Grades 5 and 6 and the intermediate level of the test was
   used for Grade 7. The reason for using two levels of the test was to work
   the illustration over a range where the articulation of two levels of the
   same test would be involved. The median time for testing computed from
   the percentile norms tables would be grades 5.3, 6.3, and 7.3 respectively.

2. Percentage frequencies were read from the probability plot for each
   one-half year grade placement interval above and below the norm (the
   norm being the 50th percentile for testing at 5.3, 6.3, and 7.3). These
   values are reported in Table I together with a composite (average) for
   the three grades.

3. The testing times of grades 5.3, 6.3, and 7.3 were translated to
   chronological age equivalents of 127, 140, and 152 months respectively.
   These equivalents are contained in the Manual for the California
   Achievement Test.

4. The percentile norms for the total mental factors score for the California
   Test of Mental Maturity for these ages of testing were plotted on
   probability paper. The raw scores in the percentile tables were changed
   to intelligence grade placement equivalents and these units were used
   in the plots. Here again the elementary level of the test was used for
   chronological ages of 127 and 140 and the intermediate level for testing
   at 152 months.

5. Percentage frequencies were read from the probability plot for each
   one-half year intelligence grade placement interval above and below the
   norm (the norm being the 50th percentile for testing at 127 months
   or intelligence grade placement of 5.3, 140 months or 6.3, and 152
   months or 7.3). These values are reported in Table I together with a
   composite (average) for the three testings.
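
The graphical readings in steps 2 and 5 can be approximated numerically. The following is only a sketch: it assumes the probability plots were close to straight lines (that is, that the deviations from the norm are roughly normally distributed), and it borrows the standard deviations of 1.35 and 1.89 grade placement units reported later in this article. The function name and the choice of four intervals on each side of the norm are illustrative and do not come from the test manuals.

```python
from statistics import NormalDist


def interval_percentages(sd, half_width=0.5, n_intervals=4):
    """Percentage of pupils in each half-year interval around the norm.

    The norm is treated as deviation 0; `sd` is the standard deviation
    in grade placement units. A normal distribution is assumed.
    """
    dist = NormalDist(mu=0.0, sigma=sd)
    edges = [half_width * k for k in range(-n_intervals, n_intervals + 1)]
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        rows.append((lo, hi, 100 * (dist.cdf(hi) - dist.cdf(lo))))
    return rows


if __name__ == "__main__":
    for label, sd in (("achievement", 1.35), ("mental maturity", 1.89)):
        print(label)
        for lo, hi, pct in interval_percentages(sd):
            print(f"  {lo:+.1f} to {hi:+.1f}: {pct:5.1f} per cent")
```

Under these assumptions the achievement distribution places a larger share of pupils in the intervals nearest the norm than the mental maturity distribution does, which is the contrast the two histograms are meant to show.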

The composites for achievement and mental maturity are plotted as 
histograms in Figure 1. From Table I and Figure 1 it is easy to see that 
the range in mental maturity is much greater than the range of performance 
in achievement when both are plotted with the same grade placement units. 
Let us review what has been done. Only data presented in the test manuals 
have been used. The mental maturity test was standardized in age units.

FIGURE 1


Percentage Frequency Histograms for Total Achievement Scores and Total Mental Factor Scores as 
Grade Placement Deviations from the Norms for Grades 5, 6, and 7 combined.

The grouping here was by chronological age. The achievement test was
standardized in grade units. The grouping here was by school grade. 
Intelligence grade placements were assigned to the mental maturity test by 
equating mental ages to corresponding chronological ages for the various 
grade groups. The results are shown in Table I and Figure 1. When common
norming units are used, we find that at any particular grade level, the
range in the mental ability of the youngsters is greater than the range in
their achievement performance. It might be pointed out that this situation
is not unique to the California Achievement Test and the California Test of
Mental Maturity but is applicable to all achievement and mental maturity
tests normed in grade placement and age units respectively. Differences that
would exist in other test situations would be due to sampling errors in the
different norming populations. This discussion relates to the general
situation and, while data from the California tests are used, it is intended
as a general discussion. To discuss the general point was one of the reasons
for using data from two different levels of the various tests as well as
combining results from three different testing ages and grades.

We now return to the problem of the interpretation of achievement test 
performance in relationship to performance on mental maturity tests. If 
we are to use the results of mental maturity tests to establish an “expectancy 
level” of achievement test performance, we must correct for the underlying 
difference in units. If we do not make this correction and instead use the 
intelligence grade placements directly, the expectancy for bright youngsters 
will be too high and the expectancy for slow youngsters will be too low. 

Statistical computation indicates that the standard deviation in grade 
placement units for the mental maturity test for a particular grade is 1.89 
grade placement units. For the achievement tests the standard deviation 
is 1.35 grade placement units. (The standard deviations can be estimated 
by plotting the composites of Table I on probability paper and using the 
various values at selected percentiles in statistical formulae.) Thus we 
find that the ratio of the deviation unit for achievement tests to the deviation 
unit for mental maturity tests is only 0.715. 
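
The estimate described in the parenthesis can be sketched as follows, on the assumption of a normal distribution. The choice of the 16th and 84th percentiles, which lie roughly one standard deviation on either side of the median, is an illustrative assumption; the article does not say which percentiles or formulae were actually used.

```python
from statistics import NormalDist


def sd_from_percentiles(value_low, value_high, pct_low=16.0, pct_high=84.0):
    """Estimate a standard deviation from two values read off a probability plot.

    value_low and value_high are the grade placement values found at the
    pct_low and pct_high percentiles; normality is assumed.
    """
    unit = NormalDist()
    z_low = unit.inv_cdf(pct_low / 100)
    z_high = unit.inv_cdf(pct_high / 100)
    return (value_high - value_low) / (z_high - z_low)


# Ratio of the two deviation units reported in the text.
SD_ACHIEVEMENT = 1.35
SD_MENTAL_MATURITY = 1.89
ratio = SD_ACHIEVEMENT / SD_MENTAL_MATURITY

if __name__ == "__main__":
    print(f"ratio of deviation units: {ratio:.3f}")
```

The quotient of the two rounded standard deviations comes to about 0.714, in agreement with the value of 0.715 used in the formula that follows.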

The following formula can be written concerning the expected achieve- 
ment (Ex) of a student in terms of his intelligence grade placement (I.G.P.). 
The symbol N indicates the norm for the grade in grade placement units 
at the time of testing. 


Ex = N + .715 [I.G.P. - N]


This formula has been applied to the mental maturity data in Table I. 
The Ex values obtained have been plotted in Figure 2 along with the 
composite of actual achievement results. These two curves agree very 
closely. By inspection it can be seen that the proposed correction applied
to the intelligence grade placement results for the mental maturity test
gives an expected achievement (Ex) that corresponds closely to
actual achievement.

FIGURE 2


Percentage Frequency Curve for Total Mental Factors Scores Adjusted to Common Units and 
Superimposed over Percentage Frequency Curve for Total Achievement Scores for Grades 5, 
6, and 7 Combined. 


One might suggest that the intelligence grade placement values should 
be corrected in the tables in the manuals of the test for the foregoing condi- 
tion. However, this cannot be done. Norms are based upon the average 
or median performance of various groups classified homogeneously with 
respect to age or grade or some other variable. We find that no correction 
is necessary at this median or central point. Norming procedure does not 
provide for an estimate of deviations in individual cases. Norming procedure 
assumes that the performance of all ten-year-olds may be represented by 
a single score. If a ten-year-old does not make this expected score then we 
describe his score as being that of a typical eleven-year-old or whatever is
appropriate.

Let us illustrate by an example that shows that corrections must be made
in individual cases. Consider an individual with an intelligence grade 
placement of 6.3. If he is in the fifth grade and is being tested at 5.3 his 
Ex is 6.0. If he is in the sixth grade and is being tested at 6.3 his Ex is 6.3. 
If he is in the seventh grade and is being tested at 7.3 his Ex is 6.6. This
illustration demonstrates that corrections must be made in each single case. 
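
The three cases in this illustration can be checked with a few lines of code. This is a minimal sketch of the formula Ex = N + .715 (I.G.P. - N); the function and argument names are illustrative only.

```python
def expected_achievement(igp, norm, ratio=0.715):
    """Expected achievement (Ex) in grade placement units.

    igp   -- intelligence grade placement of the pupil
    norm  -- norm for the grade at the time of testing (N)
    ratio -- ratio of the achievement deviation unit to the
             mental maturity deviation unit (.715 in the article)
    """
    return norm + ratio * (igp - norm)


if __name__ == "__main__":
    # A pupil with an intelligence grade placement of 6.3,
    # tested at grade placements 5.3, 6.3, and 7.3.
    for norm in (5.3, 6.3, 7.3):
        ex = expected_achievement(6.3, norm)
        print(f"tested at {norm}: Ex = {ex:.1f}")
```

Rounded to one decimal place, the three results are 6.0, 6.3, and 6.6, the same values given in the text. The same intelligence grade placement therefore yields a different expectancy at each grade of testing.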

The foregoing discussion has attempted to analyze one of the factors 
underlying the major problem of interpreting achievement test performance 
in relationship to performance on mental maturity tests. It has been pointed 
out that in part this problem is statistical in character because the units 
underlying the two sets of test scores are inherently unequal and not directly 
comparable. A simple formula has been proposed for estimating expected 
achievement in grade placement units from intelligence grade placements. 
All necessary data for developing this formula are reported in manuals of 
the tests. 


REFERENCES 


1. Ernest W. Tiegs and Willis W. Clark, Manual, California Achievement Tests.
Los Angeles: California Test Bureau, 1951. Pp. 3


2. Elizabeth T. Sullivan, Willis W. Clark, and Ernest W. Tiegs, Manual, California
Test of Mental Maturity. Los Angeles: California Test Bureau, 1951. Pp. 28.


The Sonoma County Schools Office has just released a summary of the results 
of the 1954-1955 testing program in the county. The study was prepared by 
Carmen Finley, Supervisor of Research and Publications. In addition to being a 
very comprehensive report of the status of students in the elementary schools of a 
county that may be considered to be typical for California, the publication also 
contains several innovations in techniques which should prove of interest to 
educational researchers. Inquiries should be addressed to the Office of the
Superintendent, Sonoma County Schools, 1593 Cleveland Avenue, Santa Rosa, California.


Effect on Reading Achievement of
Under-Testing Pupils in Low Third Grade


FLORENCE EDNA WARE 


In compiling yearly survey data in Los Angeles during recent years, there 
has been growing concern among educators over the apparently low achieve- 
ment in both the vocabulary and comprehension areas of reading in the 
Third Grade. This concern has prompted studies to be undertaken to 
ascertain possible causes. 

Delayed reading, inadequate testing, age norms and other related prob- 
lems are among some of the possible causes being explored at the present 
time. 

The Gates Primary Reading Tests, Type I and Type II, customarily have
been used in the Low Third Grade, with the Gates Advanced Reading Tests,
Type I and Type II, being used for a few of the more advanced pupils.
This practice has been followed even though the Primary Form was designed 
for use in Grades One and Two. Such usage may have developed out of a 
delayed reading program. 

Investigating the possibility of inadequate testing, a survey of the 
achievement of 3,176 unselected Low Third Grade pupils in the Valley 
District tested with the Gates Primary Reading Test, Type I and II, revealed 
that 903 reached the ceiling vocabulary score in Type I, Word Recognition, 
and that 637 reached the ceiling comprehension score in Type II, Sentence 
Reading. The incidence of ceiling scores ranged as high as nineteen for a 
single class. 

Further exploring the possibility of inadequate testing, several classes 
were tested with Gates Primary Reading Tests and retested with Gates 
Advanced Primary Reading Tests. 

One typical Low Third Grade class of 33 pupils so tested is reported 
here. The grade norm for this class was 3.1 with a normal age range 
averaging 8-1 and an average I.Q. of 102. 

In the vocabulary test, 18 of the 33 pupils reached ceiling norms of the 
Gates Primary Reading Test, Type I, and when given the Gates Advanced 


Florence Edna Ware is Supervising Counselor of the Valley District of the 
Los Angeles City Schools, a position she has held for three years. She was pre- 
viously an elementary teacher in Detroit and in Los Angeles, and a school counselor 
for six years. She holds a bachelor’s degree from Los Angeles State College and 
has also studied at Wayne University, the University of California at Los Angeles, 
and the University of Southern California.


TABLE I

Comparative Achievement of 33 Pupils on Gates Primary Reading and
Gates Advanced Primary Reading Tests

         Gates Primary Reading Tests     Gates Advanced Primary Reading Tests
Pupil      Type I       Type II             Type I          Type II

  1         3.3          3.5                 6.0             6.4
  2         3.3          3.5                 5.3             6.4
  3         3.3          3.5                 4.6             5.4
  4         3.3          3.5                 4.6             4.2
  5         3.3          3.5                 5.0             3.8
  6         3.3          3.4                 4.6             4.2
  7         3.3          3.4                 3.6             3.8
  8         3.3          3.4                 2.7             3.3
  9         3.3          3.3                 4.6             4.8
 10         3.3          3.3                 3.9             4.2
 11         3.3          3.3                 [illegible]     3.8
 12         3.3          3.2                 5.5             6.8
 13         3.3          3.2                 3.5             3.7
 14         3.3          3.1                 3.4             3.1
 15         3.3          3.0                 4.3             3.3
 16         3.3          3.0                 3.7             6.0
 17         3.3          2.7                 2.9             2.9
 18         3.3          2.7                 3.0             2.1
 19         3.2          2.7                 3.1             3.0
 20         3.1          3.0                 2.7             2.3
 21         3.1          2.7                 2.7             2.8
 22         3.0          3.4                 2.3             2.8
 23         3.0          2.4                 2.6             2.3
 24         3.0          3.0                 2.0             2.0
 25         2.8          2.7                 2.8             2.3
 26         2.5          2.9                 2.5             2.6
 27         2.2          2.1                 2.4             2.0
 28         2.0          2.2                 2.4             2.1
 29         2.0          1.5                 [illegible]     2.0
 30         1.9          1.6                 2.0             2.0
 31         2.3          1.6                 1.7             1.9
 32         2.0          2.2                 [illegible]     1.8
 33         1.9          1.6                 1.8             1.7

Reading Test, Type I, 15 pupils did as well or better with an average 
improvement of 11 months. On the retest 3 pupils did less well with an 
average loss of 4 months. 

The results showed that for the 18 cases the net gain was 9 months,
and the entire class made an average gain of 3 months on the Gates
Advanced Primary Reading Test, Type I, over the Gates Primary Reading
Test, Type I.

Results of the comprehension test showed that 5 pupils reached ceiling
norms on the Gates Primary Reading Test, Type II, and were given Gates 
Advanced Primary Reading Test, Type II. In all 5 cases the pupils made 
an average improvement of 17 months. No ceiling cases on the Gates 
Primary Reading Test, Type II, showed a loss when retested with the Gates 
Advanced Primary Reading Test, Type II. 
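
The 17-month figure can be checked against Table I. The sketch below assumes that one tenth of a grade placement equals one school month, and it uses the Type II scores of pupils 1 through 5 as read from the table. Two of those readings are assumptions where the printed table is unclear: pupil 1's advanced score is taken as 6.4, and pupil 4's primary score is taken as the ceiling of 3.5, which is what the count of five ceiling cases implies.

```python
# (Gates Primary Type II, Gates Advanced Primary Type II) for the five
# pupils at the Type II ceiling of 3.5. Pupil 4's primary score and
# pupil 1's advanced score are readings assumed from context.
ceiling_cases = {
    1: (3.5, 6.4),
    2: (3.5, 6.4),
    3: (3.5, 5.4),
    4: (3.5, 4.2),
    5: (3.5, 3.8),
}


def gain_in_months(primary, advanced):
    """Gain in school months, taking 0.1 grade placement as one month."""
    return round((advanced - primary) * 10)


gains = {pupil: gain_in_months(*scores) for pupil, scores in ceiling_cases.items()}
mean_gain = sum(gains.values()) / len(gains)

if __name__ == "__main__":
    print(gains)
    print(f"average gain: {mean_gain:.1f} months")
```

On these readings every one of the five pupils gains, and the average comes to a little over 17 months, consistent with the figure reported above.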

Averages for the entire class in vocabulary were 3.0 for the Gates 
Primary Test, Type I, and 3.3 for the Gates Advanced Primary Reading 
Test, Type I, a total average gain of 3 months; whereas, in comprehension 
on the Gates Primary Reading Test, Type II, the average for the class was 
2.9 and on the Gates Advanced Primary Reading Test, Type II, it was 3.4, 
a gain of 5 months. 

The changes for the individuals in the class are shown in Table I. It 
will be observed in Table I that there was a tendency for pupils who 
initially scored high on the Primary forms of Gates to make a greater gain 
on the second testing with the Advanced forms of Gates. Conversely, many 
of the low achievers on the Gates Primary actually lost when retested with 
the Gates Advanced. This suggests that it is the better readers
who are under-tested with the Primary forms of Gates. It also suggests
that poor readers in the Low Third Grade might well be tested with
Gates Primary Reading Tests.

The conclusions from this study are: 

1. For most pupils in the population tested, the Gates Advanced
Primary Reading Tests gave a better and probably truer picture of their 
actual reading ability. 

2. Only a few pupils of markedly low reading ability in Low Third 
Grade tended to do better on the Gates Primary than on Gates Advanced. 

3. Current testing practices need to be revised in order that achieve- 
ment test results represent a truer picture of actual pupil and school 
accomplishment. 


Teachers, administrators, and others interested in the economic side of teacher 
welfare will be interested in Bulletin No. 1180 of the United States Department of 
Labor, titled “Digest of One Hundred Selected Health and Insurance Plans Under 
Collective Bargaining, 1954.” This data on so-called fringe benefits should also 
prove of interest in connection with personnel policies applicable to non-certificated 
employees.


Development of Achievement Test Norms
Differentiated for Age and Intelligence 


ALFRED S. LEWERENZ 


The use in the Los Angeles City Schools of achievement test norms 
constructed for increments of chronological age and intelligence dates back 
to the summer of 1934. At that time the writer visited Carleton Washburne, 
then superintendent of the public schools in Winnetka, Illinois. Among 
other administrative devices, he was utilizing a table of expected achieve- 
ment test results in terms of pupil age and intelligence for estimating pupil 
learning goals. The essential idea was that, given two pupils of the same 
age, the one with a higher intelligence quotient would be expected to achieve 
at a higher level. These expectancies were in gross units and the whole 
table was on one sheet of 8½ x 11-inch paper.

Upon the writer’s return to Los Angeles, a project was set up in charge 
of Charles Hasemeyer to build more detailed tables using local data. 
Approximately 5000 elementary school pupil personnel cards were utilized 
for grades one through six. Hand sorts were made by intervals of five 
points of I.Q. and six months of chronological age, without regard to the 
grade placements of the pupils. Cards were distributed by the above
intervals for reading vocabulary, reading comprehension, arithmetic reason- 
ing, arithmetic fundamentals, language and spelling. As the cards for each 
subject were sorted, the means for the achievement at each interval were 
found. It should be emphasized that this sorting was done for the group 
as a whole, with no distinction being made as to grade. Pupils representing 
several grades might be found in each interval. The only controls were 
the ages of the pupils and their I.Q. levels. On completion, the six tables 
were checked for agreement. It was found that the averages for reading 


Alfred Speir Lewerenz is Director of the Evaluation and Research Section of 
the Los Angeles City School Districts. Although given his present title only two 
years ago, Dr. Lewerenz has long been engaged in research and testing in the Los 
Angeles Schools as Assistant Director of Research and Guidance and as Head 
Supervisor of Evaluation. He has been particularly concerned with the construc- 
tion and use of tests for guidance purposes. He received the doctor of education 
degree from the University of Southern California in 1937. This article is based 
upon original work done in the Los Angeles City Schools.

vocabulary and reading comprehension agreed fairly well, but that those
for the other four tests showed considerable differences. It was decided
that the averages for reading comprehension were perhaps the best single
set of data for estimating pupil attainment, so tables based upon these data
were prepared. On each page were columns by six month intervals of
chronological age and rows by five points of I.Q. In each cell was given
the average achievement for that age and I.Q. Thus it was possible for the
first time to have a norm for achievement of a twelve-year-old pupil with
80 I.Q. or a ten-year-old pupil with 130 I.Q.

In 1935, and for several years following, these differentiated norms were 
utilized as a basis for studying the achievement levels attained by pupils 
of varying ages and intelligence levels. The basic idea of a pupil working 
at, above, or below expectancy was developed. 


The X.A. Tables Reduced to Formula 


During the period of 1936 and 1937, Alice McAnulty (Horn), a member 
of the Department of Educational Research and Guidance of the Los 
Angeles City Schools, was working on her doctoral dissertation at the 
University of Southern California. As part of her study she analyzed sta- 
tistically the above tables and reduced to a series of formulas the relationship 
between school achievement and the age and intelligence of the pupil.¹
As a result of this work, tables of expected achievement ages based upon 
age and I.Q. were developed to extend beyond the original tables, both as 
to age and intelligence. These tables of expected achievement are in terms 
of educational age. 

From time to time grade placement equivalents have been assigned to 
the expected achievement ages for purposes of easier interpretation in 
connection with test results. As promotion policies have changed and state 
laws regarding age of entrance to grade B-1 have been revised, it has been 
necessary to change the grade placement values assigned to the original 
age data. 

In the fall of 1955 the Evaluation and Research Section of the Los 
Angeles City Schools prepared a new edition of the X.A. book utilizing Dr. 
Horn’s basic tables of expected achievement ages. The revision was neces- 
sitated by a lowering of the average age for grade in the Los Angeles City 
Schools since the last revision was made in 1947. The grade placement 
values now assigned to the ages are what may be termed “ideal.” They 
are based on the assumption that pupils who are six years and no months 
old will, on the average, be in grade 1.0; seven years and no months will 


1 Alice McAnulty Horn, Uneven Distribution of the Effects of Specific Factors, 
Southern California Education Monographs, Number 12, 1941.

see them in 2.0; at 8-0 they will be in 3.0, etc. In Table I will be seen
a comparison of the actual grade for age together with the ideal grade 
according to fall, 1954 data. 


TABLE I
Deviation of Actual Grade Placement from Ideal Grade Placement

    Chronological Age        Actual     Ideal     Deviation
Years & Mos.      Mos.       Grade      Grade     from Ideal (mos.)

    6-2            74         1.0        1.2         -2
    8-1            97         3.1        3.1          0
    8-8           104         3.6        3.7         -1
    9-8           116         4.6        4.7         -1
   10-8           128         5.6        5.7         -1
   11-3           135         6.1        6.3         -2
   11-8           140         6.6        6.7         -1
   12-1           145         7.1        7.1          0
   12-7           151         7.6        7.6          0
   13-1           157         8.1        8.1          0
   13-8           164         8.6        8.7         -1
   14-2           170         9.1        9.2         -1
   14-8           176         9.6        9.7         -1
   15-3           183        10.1       10.3         -2
   15-9           189        10.6       10.8         -2
   16-4           196        11.1       11.3         -2
   16-9           201        11.6       11.8         -2
   17-3           207        12.1       12.3         -2
   17-9           213        12.6       12.8         -2
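
The “ideal” column follows from the assumption that a pupil aged 6-0 is in grade 1.0, one aged 7-0 is in grade 2.0, and so on. The sketch below is one rule that reproduces the legible entries of that column. The conversion of months of age into tenths of a grade (twelve calendar months mapped onto ten tenths, rounded half up) is inferred from the table and is not stated in the article.

```python
def ideal_grade(years, months):
    """Ideal grade placement for a chronological age of years-months.

    Age 6-0 maps to grade 1.0, 7-0 to 2.0, and so on. The months of age
    are converted to tenths of a grade, rounded half up; this conversion
    is an inference from the table, not a stated rule.
    """
    tenths = (10 * months + 6) // 12  # integer form of round-half-up(10 * months / 12)
    return round(years - 5 + tenths / 10, 1)


def deviation_in_months(actual_grade, years, months):
    """Deviation of actual from ideal grade, in tenths of a grade (months)."""
    return round((actual_grade - ideal_grade(years, months)) * 10)


if __name__ == "__main__":
    for years, months, actual in ((6, 2, 1.0), (8, 1, 3.1), (11, 3, 6.1), (17, 9, 12.6)):
        print(f"{years}-{months}: ideal {ideal_grade(years, months)}, "
              f"deviation {deviation_in_months(actual, years, months)}")
```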


Los Angeles pupils, when not at “ideal” age for grade, are within one 
or two months of it, with the trend continuing toward “ideal” ages. There- 
fore, “ideal” ages have been used as grade placement equivalents for 
expectancy ages in the basic X.A. tables. Since the average I.Q. for Los 
Angeles City is about 100, the expected achievement at 100 I.Q. is that 
given for the “ideal” column above. Pupils somewhat above or below 
the average I.Q. have an X.A. varied according to the basic formula of 

XA = (2MA + CA) / 3
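
The basic formula can be illustrated with a short sketch. It assumes the conventional ratio definition of the intelligence quotient, under which mental age equals chronological age multiplied by I.Q. and divided by 100; the article does not spell this step out, and the function names are illustrative. Ages are taken in months.

```python
def mental_age(ca_months, iq):
    """Mental age in months, assuming the ratio I.Q. (IQ = 100 * MA / CA)."""
    return ca_months * iq / 100


def expected_achievement_age(ca_months, iq):
    """Expected achievement age XA = (2 MA + CA) / 3, in months."""
    ma = mental_age(ca_months, iq)
    return (2 * ma + ca_months) / 3


if __name__ == "__main__":
    # A ten-year-old (120 months) at three levels of I.Q.
    for iq in (80, 100, 130):
        xa = expected_achievement_age(120, iq)
        print(f"I.Q. {iq}: XA = {xa:.0f} months")
```

At 100 I.Q. the mental age equals the chronological age, so the expected achievement age is simply the pupil's own age. Because mental age receives twice the weight of chronological age, a pupil's expectancy moves two-thirds of the way from chronological age toward mental age.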


Chronological Retardation in Test Norms 


It has been noted that the expected achievement tables are used without 
any reference to the pupil’s actual grade. However, the achievement level 
for most achievement tests is given in terms of grade placement. In order 
to properly utilize a nationally standardized achievement test it is essential 
that the grade placement equivalents for educational ages given in the test
norms be approximate to or equal to the age-grade relationship of the
expected achievement grade placement norms at I.Q. 100. If this is not 
the case, it is like using instruments with non-standardized calibrations. In 
checking the grade placement equivalents on the achievement batteries 
currently in use in the Los Angeles City Schools, it was found that there 
was a considerable difference between any one grade placement equivalent 
given in the test norms for a given age and the actual age-grade relationship 
in Los Angeles city. For example, even at the elementary grade levels, 
there is considerable chronological retardation inherent in the grade place- 
ment equivalents provided in the test norms as shown in Table II. These 
differences constitute a “penalty” to the extent shown at the right. 


TABLE II
Chronological Retardation Inherent in Grade Placement Equivalents

Age in        L. A. Actual         Achiev. Test Norm.      “Penalty”
L. A.         Grade Placement      Grade Placement         Difference

 97 mos.      [illegible]          [illegible]             -3 mos.
116 mos.      [illegible]          [illegible]             -2 mos.
151 mos.      [illegible]          [illegible]             -4 mos.

There seems to be no question that the achievement age norms
supplied by test makers over a period of years are a truer evaluation of
performance than are the grade placement values, because chronological
age is a “native” factor in determining pupil achievement, whereas the 
grade a pupil sits in at any given age is subject to promotion policy and 
laws governing age of entrance. Ideally, it would be desirable to compare 
both expected achievement and actual test achievement on the basis of age. 
However, since age expressed either in years or years and months is a 
difficult measure for teachers to keep in mind when relating it with test 
results, the equivalent in terms of grade placement value is a handy tool. 
Therefore, to correct for over-ageness in the present test norms, it is neces- 
sary to substitute the current age-grade relationships for the older values 
which now contain much chronological retardation because of the trend 
in school systems toward reducing over-ageness in the grades. Such a 
revision was recently made by the test publishers so that beginning with 
the fall 1955 testing program in Los Angeles city, it is possible to have 
both achievement test grade placement norms and expected achievement 
grade placement values for 100 I.Q. in terms of the same age-grade values. 
No change in educational ages for raw score has been made.


Analysis of Data by Age and I.Q.


As a prelude to checking the 1955 X.A. norms a study was made of 
current achievement data. This called for the preparation of tables of actual 
achievement based upon age and intelligence quotient for each of the 
subjects measured, following the same plan as given in the first part of 
this article. Utilizing for the most part the fall 1954 elementary school 
data and the spring 1955 junior high school data, corrected for the new 
grade placement equivalents, it was possible to make a comprehensive 
study of the elementary and intermediate forms of the achievement battery 
used. Data were sorted by means of I.B.M. equipment in five point intervals 
for I.Q. and six months intervals for chronological age without regard to 
grade. Each interval was termed a “cell” and each cell was given a number 
so that the performance at any given age and I.Q. level could be discussed 
with the aid of an identification number. 
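
The sorting plan can be sketched in a few lines. The I.Q. intervals below start at 58 and run in steps of five points (58 to 62, and so on through 98 to 102, up to an open interval of 153 and above), which matches the 98 to 102 level and the 58 to 153+ range mentioned in this article. The alignment of the six-month age intervals on multiples of six months and the sample records are assumptions made purely for illustration.

```python
from collections import defaultdict

IQ_FLOOR = 58   # lowest I.Q. reported in the study
IQ_TOP = 153    # open-ended top interval, "153+"


def cell_key(age_months, iq):
    """Return (lower bound of age interval, lower bound of I.Q. interval)."""
    age_low = 6 * (age_months // 6)  # assumed alignment of age intervals
    iq_low = min(IQ_TOP, IQ_FLOOR + 5 * ((iq - IQ_FLOOR) // 5))
    return (age_low, iq_low)


def cell_means(records):
    """Mean achievement and number of cases for each age-by-I.Q. cell.

    records -- iterable of (age in months, I.Q., achievement grade placement);
    the pupil's actual grade plays no part in the sort.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for age_months, iq, score in records:
        key = cell_key(age_months, iq)
        sums[key][0] += score
        sums[key][1] += 1
    return {key: (total / n, n) for key, (total, n) in sums.items()}


if __name__ == "__main__":
    # Hypothetical records, not data from the study.
    sample = [(121, 100, 5.0), (125, 102, 6.0), (126, 103, 7.0)]
    for key, (mean, n) in sorted(cell_means(sample).items()):
        print(key, f"mean {mean:.1f}", f"n = {n}")
```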


The first sort made was to determine the number of cases that would 
fall within each cell in order to get an idea of the nature of the distribution. 
Subsequently, trial tabulations were made of certain cells on a pattern basis. 
One trial utilized the 98 to 102 intelligence level and sorts were made for 
each age group at that level. Another sort was made for cases running 
diagonally from lower left to upper right of the cell chart which represented 
maximum increments of educational growth as the diagonal ran from the 
young-dull corner (lower left) to the old-bright group (upper right). 


Conversely, another sort was made on the opposite diagonal which may 
be termed the diagonal of minimum growth, since the cells ran from the 
young-bright (upper left) to the old-dull (lower right). The means on these 
three sets of cells revealed sufficient numbers of cases and such significant 
differences in means as to warrant making a study of all the cells on the 
chart, some 460 in all. 


On completion, it was found that cell data for the elementary achieve- 
ment test battery totaled 110,553 cases and for the junior high pupils utiliz- 
ing the intermediate battery, there were 50,436 cases. The data for the 
two types of tests were handled as separate studies. The number of pupils 
in any one cell ranged, in the case of the elementary battery, from one to 
2,768 and for the junior high school battery from one to 1,074. Naturally, 
few if any cases were recorded for old-bright pupils and for very young- 
dull pupils for such would be manifestly out of adjustment chronologically. 
The age range for the elementary pupils, since none was tested below 
grade B-3, was from 6-9 to 14-8. At the junior high school level the age 
range was from 10-3 to 17-2, as these pupils were tested in grades 7, 8 
and 9. The I.Q. range for both groups was from 58 to 153+. 

Findings with regard to these tables based upon cells of six months of 
age and five points of I.Q. are: 


There is a regular increase in test performance with increasing 
increments of age and I.Q. 


At the 100 I.Q. level a six months increase of chronological age 
brings five months or less growth in achievement. 


Where test results for elementary and intermediate forms overlap 
in the same cells, the pupils tested by means of the intermediate 
form show a higher average achievement level. Put another way, 
elementary pupils of a given age and intelligence tested with an 
elementary achievement battery do less well than do junior high 
school pupils of the same age and intelligence tested with an inter- 
mediate battery. (See chart 1.) 

Some hypotheses regarding these differences in performance of elemen- 
tary and junior high pupils of the same age and intelligence are: 

The junior high groups perhaps are better and harder workers 
since they succeeded in getting ahead. They achieved better on the 
tests because they are “go-getters.” 

Junior high pupils will have been exposed to more advanced 
instructional content and therefore will have a better background of 
knowledge to bring to the test situation. 

Since pupils in grade A-6 were near the top end of the elemen- 
tary test and the B-7 junior high pupils were at the beginning of 
the intermediate test, there would be more opportunity for guessing 
on the part of the latter pupils. 


Checking by Comparing with Actual Achievement 


A further study was made of the achievement of pupils at different I.Q. 
levels and at different ages in comparison with expectancy, using 1955 
age-grade norms. For each cell the deviation of the achievement average 
above or below expectancy was figured on each of the six tests at both the 
elementary and junior high levels. These tables of deviations from expect- 
ancy, according to age and intelligence, show definite patterns of pupil 
performance: 


1. In general, the pupils of low ability were working above 
expectancy; 


2. Pupils over-age for their grade tended to be working below 
expectancy, regardless of intelligence level; 


3. The very young pupils were working above expectancy 
regardless of intelligence level; 





January, 1956 ACHIEVEMENT TEST NORMS 


[Chart: I.Q. group 98-102 compared to their expected achievement; control is on age. Panels: range in mean ages, grades B-3 to A-6; range in mean ages, grades B-7 to A-9. Where data overlap, mean age for the larger group is used. Ages are in months.] 





CALIFORNIA JOURNAL OF EDUCATIONAL RESEARCH Vol. VII, No.1 


4. The more intelligent pupils were working below expectancy; 


5. Pupils in the I.Q. range of 98 to 102 were working at or near 
expectancy within the age range normal for the grades in which 
the tests were given. 
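
The deviation tables behind these findings amount to a simple per-cell computation, sketched below in Python. The sketch assumes each record carries a cell label, an achievement score, and an expected achievement taken from the X.A. tables, all in the same unit (months are assumed here). The records shown are invented for illustration and are not data from the study; the function name is likewise hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_deviation_by_cell(records):
    """Average (achievement - expectancy) for each cell.

    records: iterable of (cell_label, achievement, expectancy), the last two
    in the same unit. Expectancy values would come from the X.A. tables and
    are supplied by the caller.
    """
    by_cell = defaultdict(list)
    for label, achievement, expectancy in records:
        by_cell[label].append(achievement - expectancy)
    return {label: mean(devs) for label, devs in by_cell.items()}


# Invented figures, for illustration only.
records = [
    ("young, I.Q. 68-72", 100, 96),
    ("young, I.Q. 68-72", 102, 96),
    ("old, I.Q. 138-142", 170, 178),
    ("old, I.Q. 138-142", 174, 178),
]
deviations = mean_deviation_by_cell(records)
for label, dev in deviations.items():
    position = "above" if dev > 0 else "below" if dev < 0 else "at"
    print(f"{label}: {dev:+} months, {position} expectancy")
```

A positive mean marks a cell working above expectancy and a negative mean a cell working below it, which is the pattern the numbered findings summarize.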


This last finding was highly significant in that it showed that average 
pupils were working at the same level of achievement as pupils of the same 
age in the standardization population which could be assumed to have an 
average I.Q. of 100. 

The observations of this phase of the study, made by Philip Nash of 
the Evaluation and Research Section staff, seem to verify the fact that the 
younger children as well as the dullest children were tending to work up 
to expected achievement level but that the relatively older pupils, regardless 
of intelligence level, were tending to work below expectancy. The fact that 
the younger ones tend to over-achieve and the older pupils tend to under- 
achieve may be due in part to regression factors within the testing instru- 
ments. Again, it may be due in part to teacher concern over the performance 
of low achievers and the absence of enrichment materials for the superior 
pupils. The fact that these bright pupils are over-age may in part be due 
to the fact that they are considered problem cases and have been retained. 


As pointed out earlier, the original expectancy tables were based upon 
pupil accomplishment in reading comprehension, and, as would be expected, 
pupil achievement in this study more closely approximated expectancy in 
reading comprehension than in any of the other five subject fields, i.e., 
reading vocabulary, arithmetic reasoning, arithmetic fundamentals, language 
and spelling. 


An examination of the accomplishment of pupils at the two extremes 
of the I.Q. range indicated that both the gifted and retarded pupils tended 
to approximate their expectancy in reading comprehension, whereas in the 
five remaining subject fields the gifted pupils usually failed to reach expect- 
ancy, and the retarded pupils usually exceeded expectancy. The regression 
relationship between I.Q. and accomplishment was taken into consideration 
by Dr. Horn in her expectancy tables, and as judged by the empirical 
findings of this study, the X.A. tables are exceptionally predictive so far as 
reading comprehension is concerned. 


It would appear, however, that the apparent regression effect is of much 
greater weight in arithmetic reasoning, arithmetic fundamentals, language 
and spelling than it is in the case of reading comprehension. The fact that 
reading comprehension depends less than any of the other five subject fields 
upon school units of instruction may influence this effect. School learning 
units that are common to all, particularly in arithmetic and language skills, 
would tend to prevent the assumed spread in achievement between the 
gifted and the retarded. The acquisition of reading comprehension skill 
lends itself to individual learning rates both in school and out, and 
consequently the speed of learning in this subject field depends to a larger extent 
upon pupil learning ability—i.e., intelligence, and individual effort in out- 
of-school reading. 

An examination of pupil achievement at any I.Q. level usually reveals 
that there are relatively small groups of younger pupils who are achieving 
above X.A. and also relatively small groups of older pupils who are achieving 
below X.A. Wherever the number of pupils in any I.Q.-age cell includes 
the total school population, the achievement of the pupils more closely 
approximates their expectancy. These relatively small groups at the extremes 
of age possibly represent pupils who have been accelerated or retarded 
for the apparently good reason that their school achievement is at a 
consistently high level or at a consistently low level. 

Several educational and psychological factors that may help to explain 
this pattern of achievement are: 


1. These groups may be under-tested by the use of a measure not 
commensurate with their actual ability. 

2. These groups may represent a large number of pupils of relatively 
high or low educational diligence who in fact are over-achieving or under- 
achieving in terms of their own expectancy. 


3. Among the older pupils, regardless of intelligence level, who are 
under-achieving, there may be a substantial number of pupils whose achieve- 
ment is retarded by reason of faulty promotion practice. In other words, 
these pupils may be retarded in the grades to the point that they have not 
had presented to them units of learning normal for their ages and I.Q. levels. 

4. These high achievers and low achievers may represent, to some 
extent, pupils whose I.Q.’s are in error. An I.Q. that was under-rated may 
lower the expectancy of the pupil to the point that he over-achieves easily 
and an over-rated I.Q. may increase the expectancy of the pupil to the 
point that under-achievement is inevitable. In this connection it should 
be remembered that the standard error of group intelligence tests is at least 
five points of I.Q. 

A statement of a pupil’s I.Q. therefore does not represent an exact point 
on a scale but a limited distance along a continuum. His “true” I.Q. may 
vary above or below the stated figure by several points, and, if it had been 
used, would have given a more correct X.A. 
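
The practical effect of this band can be shown with a short Python sketch. It assumes the five-point I.Q. cells centered on multiples of five used elsewhere in the study and a standard error of five points, the minimum figure cited above; the function names are invented for the example. It lists every cell that a stated I.Q. could fall in once one standard error is allowed in either direction.

```python
def iq_band(iq):
    """Return the (low, high) five-point I.Q. band containing iq (e.g. 98-102)."""
    center = ((iq + 2) // 5) * 5
    return (center - 2, center + 2)


def bands_within(iq, standard_error=5):
    """List the I.Q. bands touched by iq plus or minus one standard error."""
    bands = []
    for value in range(iq - standard_error, iq + standard_error + 1):
        band = iq_band(value)
        if band not in bands:
            bands.append(band)
    return bands


print(bands_within(100))  # [(93, 97), (98, 102), (103, 107)]
```

Under these assumptions a stated I.Q. of 100 is compatible with three adjacent I.Q. levels, each of which would carry its own X.A.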

In those I.Q.-age cells where there is represented the total population 
it may be assumed that a cross section of the pupils is present and chance 
errors in I.Q. above and below “true” figures would cancel themselves out. 
The fact that achievement and expectancy are more likely to be approxi- 
mately equal in these cells could indicate that the averaging of errors is 
a fact. 
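
The size of this averaging effect can be estimated with the usual formula for the standard error of a mean, sketched here in Python. The sketch assumes that the errors in individual I.Q.'s are independent chance errors sharing the five-point standard error cited above; that assumption is exactly what fails when a cell holds a selected group rather than a cross section. The function name is hypothetical.

```python
import math


def standard_error_of_mean(standard_error, n):
    """Standard error of the mean of n independent scores with a common standard error."""
    return standard_error / math.sqrt(n)


# Cell sizes: a single pupil, a small cell, and the largest elementary cell (2,768).
for n in (1, 25, 2768):
    print(n, round(standard_error_of_mean(5, n), 3))
```

On these assumptions the chance error in the mean I.Q. of the largest elementary cell is roughly a tenth of a point, whereas for a cell holding a single pupil it remains the full five points.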


At the extremes of age in either the elementary or junior high school 
levels the I.Q. and age cells include a decreasing number of pupils. Because 
of their small numbers, it can not be assumed that such pupils represent a 
cross section of pupil population at that particular level. The younger groups 
at each school level may contain pupils whose I.Q. was in error to a nega- 
tive degree, and older groups may contain pupils whose I.Q. was in error 
to a positive degree. The pupils that would balance out these errors are, 
for the younger pupils, in non-represented classes or schools of lower educa- 
tional level, and for the older pupils, in non-represented classes or schools 
of a higher educational level. 


Data Compared by I.Q. Levels and by Subject 


Another approach to the study of pupil performance on the basis of age 
and intelligence was to control on intelligence and determine the average 
achievement level by increments of age with relation to deviation from 
expectancy. For each of the six subjects and for a total of eight intelligence 
levels, graphs were drawn using a sampling of I.Q. levels. Intelligence 
levels involved were for those groups whose mean I.Q.’s were 70, 80, 90, 
100, 110, 120, 130, and 140. On each chart was given the expectancy 
curve for the I.Q. level and age intervals concerned. Super-imposed were 
the graph lines for pupil performance on the elementary and intermediate 
batteries. Inspection of these forty-eight charts reveals the following trends 
as reported by Mr. Nash: 


In general, younger pupils tend to work above expectancy. 

Where elementary and junior high pupils of the same age and I.Q. were 
tested with an elementary or an intermediate battery, the latter always gave 
the higher averages at all I.Q. levels. 

An analysis of the data by subject tests revealed the following trends: 

1. Reading Vocabulary. In the case of results for reading vocabulary 
it was found that with elementary school pupils those with above average 
I.Q. tended to work at or above expectancy, whereas pupils of average or 
less intelligence tended to work below X.A. This fact would seem to indicate 
that superior pupils are able to reach expectancy either through in-school 
or out-of-school experiences. In the case of junior high pupils, achievement 
is below expectancy at all I.Q. levels. This latter situation is so striking 
that some systematic cause seems to be operating. Studies are being made 
in an effort to run down the factor or factors causing this low performance. 

2. Reading Comprehension. It was found that on the elementary test 
for reading comprehension pupils tended to be at, or close to, expectancy 
at all I.Q. levels. However, the pupils of 138-142 I.Q. tended to be defi- 
nitely below X.A., but the pupils in the 68-72 I.Q. group tended to be 
definitely above X.A. Such a situation may be due to teacher concentration 
on helping low ability students, to the disadvantage of the very superior 
pupils who actually may be working well above grade norm. In any event, 
the pupil of very low ability does not seem to be penalized by the X.A. 


In the case of the junior high group, the average and above-average 
pupils tended to be reading above expectancy. 


3. Arithmetic Reasoning. In general, in the case of both the elementary 
and junior high groups, the above-average pupils were working below 
expectancy. Conversely, the average and below-average pupils were achiev- 
ing in arithmetic reasoning at or above expectancy. 


4. Arithmetic Fundamentals. Both the elementary and junior high 
school arithmetic fundamentals test results showed a similar trend but 
different in degree. At the elementary level there was a decided trend from 
low achievement with regard to X.A. among the high I.Q. pupils to high 
achievement in comparison to X.A. on the part of low I.Q. pupils. At the 
junior high level there was the same consistent change from relatively poor 
work on the part of high I.Q. pupils to relatively better achievement on the 
part of low I.Q. pupils. Whereas at I.Q. level 138-142 pupils were about 
a year below X.A., at the 68-72 level they were about at X.A. 


With both groups it was very evident that superior pupils were failing 
to reach expected achievement goals in arithmetic fundamentals. This 
possibly is a problem of instruction related to enrichment and appropriate 
textbooks. The present program, however, seems to be meeting the need 
of the below-average child and at the elementary level their achievement 
is markedly above what might be expected. 


5. Language Skills. In the case of the language skills test at the 
elementary school level all pupils except those in the highest I.Q. brackets, i.e., 
above 122 I.Q., were at or well above expectancy. At the junior high level 
the picture was not quite as good, as only pupils below 98 I.Q. tended to 
be at X.A. The very brightest of the junior high pupils were markedly 
below expectancy in language skills. With both groups it is evident that 
the enrichment of instruction for the gifted is an indicated need. 


6. Spelling. Results in spelling for both elementary and junior high 
pupils appeared good for pupils of I.Q. 68 to 122. Gifted pupils (above 
I.Q. 132) in both instances were often failing to attain expectancy. 


Study of the charts comparing subject achievement with X.A. by I.Q. 
levels has shown: 


1. The need for appropriate testing of pupils at beginning and ending 
points of major school levels. 


2. The basic usefulness of the X.A. tables as goals for pupil achievement. 


3. The tendency for pupils of less than average intelligence to be work- 
ing at or above expectancy in arithmetic and language skills. 


4. The tendency for pupils of above-average intelligence to be working 
below expectancy in arithmetic and language skills. 


5. The trend for brighter pupils to achieve well in reading vocabulary 
and comprehension as compared with relatively low attainment on an expect- 
ancy basis for pupils of less than average intelligence. 


Tables of Actual Achievement by Age and I.Q. 


A final step was to prepare tables of actual achievement in each of the 
subject fields tested by age and intelligence level. In format they appear 
much like an expected achievement table. The difference between them 
is that one is an ideal situation for all subject tests while the other presents 
the prevailing attainment by subject level. Since our instructional program 
apparently tends to favor certain types of pupil, attention given to pupil 
achievement at each age and intelligence level and by each subject level 
will show where help needs to be given in order to bring all pupils up to a 
level commensurate with their mental and chronological maturity. Another 
use of these new tables is to facilitate the study of the attainment of pupils 
in schools and classes without the usual graded organization. When such 
schools have their pupils’ achievement distributed and summarized by the 
same cell intervals it is possible to compare any cell for a given school with 
that of the achievement of all pupils of the same age and intelligence. Such 
tables of norms differentiated by age and I.Q. supplement the usual so-called 
“national norm” in that they make possible the interpretation of test achieve- 
ment for groups other than those of the average population. Such norms 
eliminate the confusing factor of actual grade. 

Since age and intelligence seem to have a greater bearing upon deter- 
mining a pupil’s achievement level than does his particular grade assignment, 
these factors are more useful in interpreting achievement results than are 
the usual so-called “national norms” based on grade means for an average 
population of 100 I.Q. 

When a pupil is young and of low ability and in a grade which actually 
represents acceleration he must compete with a national grade norm beyond 
his capacity. On the other hand, a pupil who is old and bright for his grade 
may either be under-tested by the achievement test which he is given or fail 
to do as well as he should because he has not yet been exposed to sufficiently 
advanced work. 


Summary 


In summary we may say that: 


1. The analysis of pupil test data by age and I.Q. has revealed the 
fact that age norms tend to be more basic than grade placement equivalents. 


2. School districts whose pupils tend to be on the average above or 
below 100 I.Q. or whose age-grade relationship is different from that pro- 
vided by the test publishers will not be able to judge fairly their attainment 
by the use of the usual “national” grade placement norm. 


3. Where I.Q. is average but there has been a change in ageness result- 
ing from a change of promotion policy, better interpretation can be secured 
by the use of the age norms included in the test manual. 


4. Where marked differences occur with regard to both average intelli- 
gence and age-grade status, some sort of supplemental table should be used 
based on the achievement levels of pupils by increments of chronological 
age and intelligence. 


It is to be hoped that test publishers in the future will facilitate test 
interpretation by supplying such supplemental information as is described 
above. 


School districts themselves should strive to maintain a logical promotion 
policy that will not penalize the pupils who are measured on a grade norm 
basis. At the same time, they should make sure that pupils are neither 
under-tested nor over-tested by the instruments used, lest they fail to attain 
an achievement level commensurate with their ability. 


“Our Public Schools” is the title of the annual report of the Superintendent 
of Schools of the City of New York. Part III of the 1954-1955 report deals with 
the operation and maintenance of school buildings. It is presented as a separate 
booklet of very attractive format. It should prove of interest to administrators 
and others as an example of a way to make this type of reporting interesting to 
those who are not experts in business administration or building operation. 


Current Trends in Test Construction in India 


JOHN ALLAN SMITH 


There’s a lot of research going on in Indian education; much more than 
one would realize. But it’s hard to get at. Most of it emerges as the 
by-product of graduate student effort in connection with professional degrees. 

India (the Federal Government of India, that is) is dedicated to 
research in the physical sciences. National laboratories for research in 
physics, chemistry, food technology, and other fields are to be found scattered 
throughout the country. As conspicuous and magnificent as many of these 
edifices are, national institutions for research into the social sciences are 
just as conspicuous for their absence. The Government of India has not 
felt quite the same need for research into the social sciences as it has for 
investigation into the physical sciences. Interest, however, is awakening. 

The Central Institute of Education, University of Delhi, has been 
specially charged by the Ministry of Education of the Government of India 
with responsibility for educational research. On the premises of the Central 
Institute of Education are also located the Ministry’s Central Bureau of 
Textbook Research and the Central Bureau of Vocational and Educational 
Guidance. So there is a beginning, but Indian education, like American, is 
a state affair and not a federal responsibility. Hence, it is to the state gov- 
ernments that Indians must eventually look for a measure of interest and 
leadership in the realm of educational research. 

There are no national, state, or district organizations, such as in America, 
dedicated specifically to educational research. There has been proposed, 
though, an All-Indian Association of Guidance and Counselling which 
unquestionably would have a strong research bias. Research at the state 
or district governmental level is negligible except for the few isolated 
vocational guidance bureaus, most of which have come into existence only 
during the past two or three years. 

There are a number of professional journals in education published in 
the various states of India. These carry occasional articles reporting or 
stimulating interest in research. The Indian Journal of Educational Research, 
however, is the only periodical exclusively devoted to research in education. 
The new Psychology News Bulletin, published by the Psychology Division 
of the Ahmedabad Textile Industrial Research Association, contains 
information on tests and other research in India. The Journal of Vocational and 
Educational Guidance carries reports of tests and testing in connection with 
guidance services and personnel appraisal. 

John Allan Smith was a Fulbright Exchange Professor to India in 1954-55, 
lecturing on problems of writing, editing, publishing, selecting, and using 
textbooks. He is Supervisor, Research and Vocational Guidance for the Los Angeles 
City Schools, a position he has held since 1946, with time out for other assignments 
within the Los Angeles City Schools. He received the Ed.D. degree in 1948 at the 
University of Southern California. 

This article is not and, of course, cannot be a complete survey of educa- 
tional research in India. It more properly is an overview of the impres- 
sions regarding tests and testing gained by the writer when visiting a number 
of Indian universities and teacher training colleges during 1954-1955 and 
through supplemental correspondence with Indian educationists (their term). 

When one reviews the titles of Indian theses and dissertations, he is 
instantly struck by the similarity of Indian research problems to those under- 
taken by master and doctoral candidates in America. One inescapable 
conclusion: Education is education, the world over. 

Without a doubt, most of the forceful Indian research has been in the 
area of educational psychology and guidance, particularly testing. The 
areas of supervision and administration have scarcely been touched. 

Curriculum and instruction have received moderate attention. Interest 
in evaluation has been increasing. Research into the historical aspects of 
Indian education has commanded spasmodic attention, but educational 
philosophy has received less consideration than it probably deserves. Prob- 
lems of teaching personnel, though, have evoked considerable study. 

One cannot be sure whether Indian interest in problems psychological 
is a reflection of that discipline’s popularity in the United Kingdom and 
the United States of America, or whether it is a concomitance of the accessi- 
bility of the data and the provocative nature of the problem. No matter. 
Much of Indian research is in the area of tests and testing. 


Tests and Testing 


Research into the development and standardization of testing instru- 
ments is under way—and necessarily, too—in all parts of India. Intelligence 
tests, group and individual, non-verbal as well as verbal; aptitude tests; 
educational achievement tests; vocational interest and personality inven- 
tories—all are being undertaken. The composition and standardization 
procedures of these tests are proceeding along accepted lines; but not 
without difficulty. India, I believe we can say, has passed beyond the stage 
of mere translation and substitution of “mango” for “orange” to devising 
instruments designed for its own culture. 

Among tests definitely of Indian origin are Mehta’s Hindi Test of 
Intelligence, Maiti’s Matrices (abstract intelligence), Samoohik Mansika 
Yogyta Pariksha, also known as Jalota’s Test of General Intelligence (Hindi), 
Madras Tests for General Intelligence (non-verbal; also English), Bhatia’s 
Battery of Performance Tests (Hindi, achievement), Sohan Lal Allahabad 
Intelligence Test (Hindi), Pareek’s Clerical Aptitude Test, and the C.I.E. 
Admission Test for Prospective Teachers. 

The multiple language situation, though, is causing complications in 
the establishment of comparable norms. No one can be sure that a child 
attaining 100 I.Q. in Marathi, for example, has the same mental capacity 
as a child attaining 100 I.Q. in Gujarati. In addition to Hindi, the national 
language, there are fourteen recognized regional languages! 

In setting up a standardization population, caste introduces further 
difficulties. Much as has been said about caste differences, Indian census 
enumerators experienced difficulty in determining specific caste membership. 
Then, too, the increasing proportion of male pupils in school at the higher 
grades or standards introduces a systematic bias that also has to be sur- 
mounted in setting up a standardization population. The determination of 
the exact chronological age of a child is no mean problem either. In rural 
areas all that a parent may be able to state is that the child was born the 
year the cow died. Add to these: that most of this effort is without financial 
aid and has to be executed by students working for graduate degrees—and 
one can sympathize with his fellow researcher in India. Most university 
theses, furthermore, have to be reduced in size to the time that students 
can give them. Therefore, much Indian effort becomes, for example, a 
very limited sampling bearing such titles as, “Construction and Standardiza- 
tion of a Group Intelligence Test Suitable for Age Group Twelve, Thirteen 
and Fourteen” or “A Study of Goodenough’s Man Drawing Test of Intelli- 
gence with Indian Children of Age Groups 6+ to 10+ with a View to 
Establishing Norms for Their Intelligence.” 


Intelligence Tests 


Several universities are working on group and individual tests in the 
national language, Hindi. Among these are Lucknow, Banaras Hindu, 
Allahabad, Aligarh Muslim, Mysore, Patna, and Delhi; also the 
Uttar Pradesh Bureau of Psychology, Allahabad; the Central Institute of 
Education, Delhi; the A. G. Teachers’ College, Ahmedabad; the Bureau 
of Educational and Psychological Research, David Hare Training College, 
Calcutta; the Vocational and Educational Bureau, Bikaner, and the M.S. 
University of Baroda. Most are using the Stanford-Binet and Wechsler, 
substituting pictures, vocabulary, and problems to fit Indian culture and 
language. 


There is considerable effort to develop or adapt existing intelligence 
tests to the regional languages, probably correctly so, as they are the media 
of instruction in the elementary grades. Tests are being developed in 
Bengali, Tamil, Gujarati, Kannada, Marathi, Urdu, and Malayalam. Institutions 
active in the promotion of this type of research include the Banaras 
Hindu, Patna, Calcutta, and Baroda M. S. Universities; the Teacher Training 
College, Srinagar; the Vocational Guidance Bureau, Parsi Panchayat, Bombay; 
the B. M. Institute of Child Development, Ahmedabad; and St. 
Christopher’s Training College, Vepery, Madras. There may be others that 
did not come to the writer’s attention. The attempts of twenty years ago 
were merely translations of American or British tests with but minor adapta- 
tions to India. Present-day attempts are more sophisticated. 

Intelligence tests in English, likewise, are being developed for Indian 
children. English, the “international” language, though an optional subject, 
is still studied by the vast majority of secondary students and is also the 
general medium of instruction at the university level. Work in mental 
ability test construction and standardization for Indians speaking English 
has been undertaken at St. Christopher’s Training College, Madras; Banaras 
Hindu University; Patna University, and the Vocational Guidance Bureau 
of the Parsi Panchayat, Bombay. 

Circumvention of the multiplicity-of-languages problem is being sought 
through development of performance and non-verbal tests, drawing upon 
United Kingdom and American sources. Experimentation is under way, 
utilizing Raven’s Progressive Matrices, Alexander’s Performance Scales 
(Passalong, Kohs’ Block Design, and Cube Construction tests), Moray House 
Picture Intelligence Test, along with Porteus Maze, Dearborn’s Formboard, 
and Goodenough’s Draw-a-Man tests. Delhi, Baroda, Patna, and Mysore 
Universities, together with A. G. Teachers’ College of the Gujarat Univer- 
sity, are taking a lead in this effort. 

The Faculty of Education and Psychology, M. S. University of Baroda, 
has reworked Goodenough’s Draw-a-Man test for Gujarati children. They 
have had to change some of the scoring procedures, as apparently no psycho- 
logical meaning attaches to sex differences in the human anatomy of young 
children. (Most Indian children wear “birthday clothes” until about four 
years of age.) Also, as most children go barefooted, toes are of no significance 
unless omitted from the drawing. 


Achievement Tests 


The attempt to measure the educational attainment of pupils by means of 
standardized tests is a much more recent innovation in Indian education 
than the measurement of intelligence. Parenthetically it may be added that 
neither type of test is widely used throughout India. It is only in areas close 
to universities doing research that intelligence and achievement tests are 
to be found. 

A commonly voiced criticism in India of achievement tests is that they 
do not fit the syllabus (course of study). What the Indian critic undoubtedly 
means is that the tests do not fit classroom instruction. This the tests 
probably never can do until India’s universal system of external examina- 
tions is modified or itself standardized. India follows the old British system 
of determining a pupil’s eligibility for promotion to the next higher 
educational level by externally (and, usually, subjectively) prepared examinations 
(e.g., School Leaving, Secondary School, Matriculation certificates). The 
classroom teacher, necessarily, teaches to the content of previous examina- 
tions to the neglect or disregard of the adopted syllabus. The reason: his 
professional fate—and that of his pupils and his school—often rests upon 
the percentage who successfully pass these “external” examinations. The 
test constructor, on the other hand, must rely upon the adopted syllabus to 
help him determine content. All too continuously, the twain never meet! 

Yet, when one examines the aggregate effort to devise standardized 
achievement tests, he is heartened by what he finds. 

The staff of the Central Institute of Education, Delhi, is in the process 
of producing and standardizing a series of achievement tests for elementary 
school subjects: general knowledge, mathematics, general science, geogra- 
phy, civics, economics, and Hindi. These will be used in making a survey 
of Basic Education in the State of Bihar. (Basic Education is an indigenous 
craft type of project learning in which the pupil learns the various school 
subjects concomitantly with instruction in agriculture, cotton ginning, 
weaving of cloth, lacquering, paper making, woodworking and similar 
“vocational” activities.) 

Achievement tests have been devised for Hindi, English (a foreign 
language), and the regional languages such as Bengali and Tamil. Social 
studies have received attention with tests in Indian history, geography, 
civics, and general knowledge. Tests in arithmetic, algebra, geometry, and 
trigonometry as well as tests in general science, chemistry, physics, and 
home science have been written. Most of these have been done in partial 
satisfaction of the requirements for an advanced professional degree in 
education and cover only a single grade or class. 

Indian interest in achievement testing on a standardized basis definitely 
is increasing. Each university certainly receives one or more theses each 
year devoted to the measurement of attainment in some secondary school 
subject. The elementary field remains untouched, probably because students 
receiving master’s and doctor’s degrees are interested in secondary or higher 
education. Research into the use of achievement tests for diagnostic purposes 
is yet to get under way. No publisher producing standardized tests 
on a commercial basis came to the attention of the writer during his stay 
in India. 


Guidance Tests 


India’s concern about the effect of unemployment and un-utilized education 
upon morale, social unrest, and economic stability is reverberating to 
stir interest in vocational guidance. This in turn is underscoring the need 
for a battery of measures for aptitudes, vocational interests, and personality 
traits. Attempts are in the initial stage of using directly or adapting available 
tests and inventories from the U.K. or the U.S.A. Thus there are reports of 
establishing local norms for the Minnesota Form Board, Clerical, and Rate 
of Manipulation tests, and Bennett’s test of Mechanical Comprehension. 
The Differential Aptitude Test battery, particularly the abstract reasoning 
test, is being tried out in several locales. 


Aptitude Testing 


University departments of psychology and the several bureaus of voca- 
tional guidance have carried major responsibilities for constructing and 
standardizing aptitude tests. The aptitudes measured have followed in the 
wake of the researcher’s interests or the counselor’s needs. No apology 
need be offered for the small, stratified population samples on which the 
tests usually have been standardized and normed. Such must be anticipated 
in a country just getting under way in test research. 

Professor P. K. Roy’s description of the construction of a clerical aptitude 
test at the Central Institute of Education, Delhi, is typical of the procedure 
followed: 


This was an adaptation of Minnesota Vocational Test for clerical workers. 
The test was adapted to Delhi situation by changing certain items of the Part II 
of the test. The sample consisted of six different groups of one hundred each. 
These groups included boys and girls in the Higher Secondary Schools and clerks 
in the offices. Both Arts and Commerce students were given the test. The clerks 
group included inexperienced, experienced general and account clerks. The test 
of significance was applied to find out the difference in the scores of the various 
groups selected for the study. Norms were prepared by calculating percentiles. 
Test retest reliability was found to be .87 for Part I and .83 for Part II. (3) 


The following is a summary of institutions known to be constructing, 
adapting, or standardizing aptitude tests: 

Art Aptitude—State Vocational Guidance Bureau, Bombay (Design 
Judgment); Department of Psychology, M. S. University of Baroda (art 
appreciation). 

Clerical Aptitude—Central Institute of Education, University of Delhi; 
Vocational Guidance Bureau, Parsi Panchayat, Bombay; the Faculty of 
Education and Psychology, M. S. University of Baroda; the Institute of 
Psychological Research and Service, Patna University; Department of 
Psychology, Calcutta University. These generally are patterned after 
American models. The Vocational and Educational Bureau, Bikaner, is 
developing a Hindi version of the Parsi Panchayat clerical aptitude test. 

Educational or Teaching Aptitude—Central Institute of Education, 
University of Delhi; Bureau of Educational and Psychological Research, 
David Hare Training College, Calcutta; Department of Psychology and 
Education, University of Lucknow; Department of Psychology, Mysore 
University. 

Engineering Aptitude—Institute of Psychological Research and Service, 
Patna University; Department of Psychology, Calcutta University; Educa- 
tional Guidance Services, Teachers’ College, Saidapet, Madras. 

Leadership Aptitude—The Institute of Psychological Research and Service, 
Patna University (principally personality traits). 

Mathematical Aptitude—Department of Psychology and Education, Uni- 
versity of Lucknow; Bureau of Educational Research, Ewing Christian 
College, Allahabad. 

Mechanical Aptitude—Departments of Psychology, M. S. University of 
Baroda and University of Mysore (manual and tweezer dexterity); David 
Hare Training College, Calcutta (mechanical aptitude, Bengali). (Trade 
tests, which are actually work tasks supplemented by oral questions, have 
been devised by the Northern Railway for fitters, turners, tinsmiths, welders, 
carpenters, and polishers.) 

Police Work Aptitude—Bureau of Educational and Psychological Re- 
search, David Hare Training College, Calcutta; Uttar Pradesh State Bureau 
of Psychology, Allahabad. 

Scientific Aptitude—The David Hare Training College, Calcutta; Bureau 
of Educational Research, Ewing Christian College, Allahabad (physics- 
chemistry aptitude). 

Social Work Aptitude—School of Social Work, M. S. University of 
Baroda; State Bureau of Vocational Guidance, Bombay. 

Stenographic Aptitude—Vocational Guidance Bureau, Parsi Panchayat, 
Bombay. 


Personality Measurement 


Measurement of personality traits is an attention-absorbing activity in 
India. Instruments for assessing general personality characteristics or for 
indicating temperament of occupational groups have been attempted. The 
following institutions report having devised or being in the process of devis- 
ing or adapting personality inventories: Department of Psychology, Calcutta 
University; Institute of Psychological Research and Service, Patna Univer- 
sity; Department of Philosophy and Psychology, Lucknow University; 
Vocational Guidance Bureau, Parsi Panchayat, Bombay; Bureau of Educa- 
tional and Psychological Research, David Hare Training College, Calcutta 
(for teachers); Laboratory of Experimental Psychology, Banaras Hindu 
University; Department of Psychology, Mysore University. The Parsi 
Panchayat’s inventory, in turn, has been adapted into Hindi by the Voca- 
tional and Educational Bureau, Bikaner. 


Vocational Interest Inventorying 


Vocational Interests—Vocational interest inventories have been or are 
in process of development by the Vocational Guidance Bureau of the Parsi 
Panchayat, Bombay; State Vocational Guidance Bureau, Bombay; the Insti- 
tute of Psychological Research and Service, Patna University (patterned 
after Strong); the Department of Psychology, Mysore University. 


Summary 


Interest in educational research in India is growing apace with major 
production concentrated at the master’s degree level in the form of thesis 
writing. This is being supplemented in the area of testing by day-to-day effort 
of the several bureaus of vocational guidance and psychological research 
scattered throughout the country. Under these conditions few of the 
broad problems of Indian education can be attacked through research. 

Research into the areas of testing and test construction is proceeding 
from the secondary level and probably will grow downward into the ele- 
mentary level as graduate students interested in younger pupils appear for 
professional degrees. The point of departure in test construction has moved 
from the translation of existing British and American tests to more sophisti- 
cated adaptations of such instruments to Indian culture and conditions. A 
few tests already have appeared which can be regarded as indigenous to 
India. Undoubtedly there will be others as India builds its own pool of 
educational researchers. 

There are many barriers to research in India. Lack of finance, lack of 
qualified researchers, lack of time, indifference, the multiple language situation, 
and India’s own culture often stand in the way. 

Techniques of research in the social sciences are not as yet well known, 
and most Indian training colleges have not yet felt the impact (or lack) of 
skillfully executed research upon their own status as educational institutions. 
The inter-university exchange of information about research in progress or 
completed is practically unknown. There is no adequate avenue or medium 
to facilitate such communication. Yet it is heartening to report that the 
most worn book in any Indian teachers’ college is likely to be Monroe’s 
Encyclopaedia of Educational Research. 


A PRIMER OF SOCIAL STATISTICS 


Sanford M. Dornbusch and Calvin F. Schmid 
New York: McGraw-Hill Book Company, Inc., 1955. 251 pages. $4.75. 


While this book is listed in the McGraw-Hill series in sociology and 
anthropology, it should not be overlooked by instructors and students of 
beginning statistics in education. It makes a sincere effort to bring to 
typical undergraduates, and perhaps numerous graduates, an understanding 
of social statistics and skill in their primary usages in spite of deficiencies 
in mathematics. For example, the book is so written that even the student 
without algebra can make headway in it. 

The book covers early statistics through linear correlation and contin- 
gency. It includes in an appendix usual tables of squares and square roots, 
areas under the normal curve, t values, levels of significance, and others. 
While the examples given to illustrate principles and operations tend to 
come from the sociology field rather than the field of educational measuring, 
there would appear to be no great disadvantage unless the instructor was 
of the belief that the latter orientation should be present in the text. 

The book is richly illustrated and is of attractive format and composition. 
Classroom teachers who are looking for an additional reference book which 
is not too technical and difficult to follow will find A Primer of Social 
Statistics to their liking. The authors are well recognized in the field of 
sociology and demography. 


PREPARATION OF CORE TEACHERS FOR 
SECONDARY SCHOOLS 


By the Committee on Preparation of Core Teachers of the ASCD 


Washington, D.C.: Association for Supervision and Curriculum Development, 1955. 
96 pages. $1.25. 

The book begins by defining a core program as one that “consists of 
two or more periods per day, during which teachers and pupils may engage 
in a wide variety of significant activities. These activities may include 
identifying problems, setting objectives, group planning, committee investi- 
gations, field trips, classroom forums, corrective work in speech, writing and 
reading, practice in the skills of group work, dramatizing an original play, 
using community resource people, doing library research, summarizing, 
organizing and interpreting results of study.” There is, in the opinion of 
the Committee, an increasing need for teachers for such a program. 

The core teacher must have effective skill and competence in five areas, 
according to the findings of the authors. These areas are: 


“Understanding the adolescent and helping to meet his needs 

Using major fields of knowledge as resources for studying and solving common 
problems 

Providing leadership in the use of democratic group processes 

Counseling and guidance 

Organizing and utilizing learning materials.” 


It is proposed that the training of core teachers be based upon a broad 
foundation of general education. Electives should be used to fill in personal 
deficiencies, permit pursuit of special interests, and round out the indi- 
vidual’s general culture and ability to express himself in many media. 
Professional education would, among other things, heavily emphasize the 
development of competence in the use of group processes. 


MODERN PHILOSOPHIES AND EDUCATION 


The Fifty-fourth Yearbook of the National Society for the 
Study of Education, Part I 


Chicago: University of Chicago Press, Distributors, 1955. 374 pages. $4.00 cloth, 
$3.25 paper. 

This volume is intended to be supplementary to the Society’s Forty-first 
Yearbook on Philosophies of Education. The primary difference between 
the volumes is that the new one concerns the applications of general philoso- 
phies to education, rather than the statement of educational philosophies as 
such. Under the chairmanship of John S. Brubacher, nine outstanding 
philosophical writers have each outlined the implications they feel a certain 
philosophical framework has for certain vital educational problems. Each 
writer was assisted by an educational consultant who helped to overcome any 
handicap that the philosopher might have due to lack of familiarity with 
educational practices. 

Each author was asked by the chairman to cover six points—namely, 
(1) a statement of his general philosophical orientation, (2) aims, values, 
and curriculum, (3) the educative process, its methods, motivation, and 
the like, (4) the relationship of the school to society, (5) the relation of 
the school to the individual, and (6) religious and moral education. Most 
of the authors followed this outline very commendably. Those that deviated 
markedly from it apparently did so because their philosophy was such as 
to render certain of these problem areas meaningless as stated. 

There would seem to be no reason to deny that this book will serve a 
useful purpose as a source to which educators can turn in order to get a 
better understanding of what some of the philosophical critics of modern 
schools seem to have in mind. However, it does not seem likely that it will 
serve to provide a framework that will prove useful in formulating problems 
amenable to research studies. In spite of the use of educational consultants, 
the various chapters seem only to represent different degrees of abysmal 
ignorance of education on any level except that of the college and graduate 
school. 

The most completely disconcerting aspect of the book, however, is its 
revelation of the extent to which most philosophies depend upon a play on 
words. It almost seems as though some of the authors deliberately decided 
to suppose certain unlikely ideas in order to be different from and to argue 
with other thinkers. If this aspect of the publication should lead some 
thoughtful educator to formulate a reasonable and functional educational 
philosophy for American public schools, it will have served a useful purpose 
beyond that intended by its editors. 


UNDERSTANDING PEOPLE IN DISTRESS 


By Barney Katz and Louis P. Thorpe 
New York: The Ronald Press Company, 1955. 357 pages. $4.00. 


Barney Katz is a practicing clinical psychologist, a lecturer on mental 
hygiene at the University of Southern California and the Los Angeles 
County Hospital, and a fellow of the American Psychological Association. 
Louis P. Thorpe is professor of education and psychology at USC and a 
former director of that institution’s psychological clinic. Both men are 
fully capable of writing a book of a highly technical nature. In this instance, 
they have not chosen to do so. 

The purpose of the present volume appears to be primarily that of 
providing accurate information in non-technical form. Thus the book will 
prove to be particularly valuable to laymen. It may be questioned whether 
or not teachers and other educational workers should be considered strictly 
in this category with respect to psychological matters; but there can be no 
doubt that many of them will appreciate this publication. It should prove 
particularly valuable as supplementary reading for those who are just 
beginning the psychological part of teacher training programs. 

One of the best features of the book is the relaxed, impartial style with 
which various behavioral problems are considered. If teachers, or prospec- 
tive teachers, can absorb some of this attitude and apply it in their day-to- 
day dealings with children, the authors will have done a good service. 







CALIFORNIA JOURNAL OF EDUCATIONAL RESEARCH 


Published by 
California Teachers Association 
693 Sutter Street 
San Francisco 2, California 


INFORMATION AND INSTRUCTIONS 
REGARDING MANUSCRIPTS 


The California Journal of Educational Research welcomes original 
manuscripts on educational research. The types of material desired are: 


1. City and county school research pertaining to curriculum, guidance 
and counseling, evaluation, supervision, and finance. 


2. Digests of theses and dissertations that have practical application. 
Such theses and dissertations, however, must officially be approved 
for publication by a member of the Editorial Board (see names on 
inside of front cover) or by the college or university at which the 
research was done. 


3. Studies that present novel, but tested, approaches to the solution 
of educational problems. 


Manuscripts, except for feature articles, should be limited to approxi- 
mately 1500 words. They should be typewritten double-spaced, on one 
side of the paper, and submitted in duplicate. Only original manuscripts 
will be accepted for publication. 


Tables, charts and graphs often enhance a research report. When 
used, they should be clearly and accurately labeled. They should also be 
inked to size and their places designated in the manuscript. 


Footnotes should be complete as to author, title, publisher, date and 
pages. Bibliographies accompanying manuscripts should also be complete 
as to data, and should be arranged alphabetically by authors. 


Manuscripts will not be returned unless the author so requests, in which 
case a stamped, self-addressed envelope should accompany the manuscript. 





